{"id":2536,"date":"2023-03-29T17:27:05","date_gmt":"2023-03-29T17:27:05","guid":{"rendered":"https:\/\/osdbtechnologies.com\/?p=2536"},"modified":"2023-03-29T17:27:05","modified_gmt":"2023-03-29T17:27:05","slug":"essential-queries-for-postgresql-health-assessment","status":"publish","type":"post","link":"https:\/\/divaind.com\/ie1\/2023\/03\/29\/essential-queries-for-postgresql-health-assessment\/","title":{"rendered":"Essential Queries for PostgreSQL Health Assessment"},"content":{"rendered":"<p>In this post, we&#8217;ll look at a few crucial queries that are helpful for digging into PostgreSQL problems and evaluating the health of an existing instance.<\/p>\n<h2 id=\"What-is-hit-ratio-of-the-database?\" data-renderer-start-pos=\"159\"><strong data-renderer-mark=\"true\">What is the hit ratio of the database?<\/strong><\/h2>\n<pre class=\"lang:pgsql decode:true\">SELECT \n  datname, \n  round(\n    (\n      -- NULLIF avoids division by zero on databases with no reads yet\n      blks_hit :: float * 100 \/ NULLIF(blks_hit + blks_read, 0)\n    ):: numeric, \n    2\n  ) as hit_ratio \nFROM \n  pg_stat_database \nWHERE \n  datname not in (\n    'postgres', 'template0', 'template1'\n  );<\/pre>\n<p>The <strong data-renderer-mark=\"true\">hit ratio should be close to 100%<\/strong>. If the <strong data-renderer-mark=\"true\">hit ratio is less than 90%, shared buffers may be allocated too little memory, or queries may be doing large table scans<\/strong>. 
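As a first step, check how much memory is currently allocated; a minimal sketch (the 2GB value is only a placeholder, not a recommendation from this post):<\/p>\n<pre class=\"lang:pgsql decode:true\">-- show the current shared buffer allocation\nSHOW shared_buffers;\n\n-- raise it cluster-wide (placeholder value; a server restart is required for it to take effect)\nALTER SYSTEM SET shared_buffers = '2GB';<\/pre>\n<p>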
Either increase shared_buffers or tune the queries that are doing the most IO.<\/p>\n<h2 id=\"What-is-commit-ratio-of-database?\" data-renderer-start-pos=\"573\"><strong data-renderer-mark=\"true\">What is the commit ratio of the database?<\/strong><\/h2>\n<pre class=\"lang:pgsql decode:true\">SELECT \n  datname, \n  round(\n    (\n      xact_commit :: float * 100 \/ NULLIF(xact_commit + xact_rollback, 0)\n    ):: numeric, \n    2\n  ) as successful_xact_ratio \nFROM \n  pg_stat_database \nWHERE \n  datname not in (\n    'postgres', 'template0', 'template1'\n  );<\/pre>\n<p><strong data-renderer-mark=\"true\">If the commit ratio is less than 95%, engage the application team to determine why there are so many transaction rollbacks.<\/strong> <strong data-renderer-mark=\"true\">Rolled-back transactions that contain DML still leave dead tuples behind, which results in fragmentation (bloat).<\/strong><\/p>\n<h2 id=\"Get-the-temp-file-usage-of-database?\" data-renderer-start-pos=\"1115\"><strong data-renderer-mark=\"true\">Get the temp file usage of the database<\/strong><\/h2>\n<pre class=\"lang:pgsql decode:true\">select \n  datname, \n  temp_files, \n  round(temp_bytes :: numeric \/ 1024 \/ 1024, 2) as temp_filesize_mb \nfrom \n  pg_stat_database \nWHERE \n  datname not in (\n    'postgres', 'template0', 'template1'\n  ) \n  and temp_files &gt; 0;<\/pre>\n<p>If you see that the temp file count and bytes are high, use the <strong data-renderer-mark=\"true\"><a href=\"https:\/\/www.postgresql.org\/docs\/current\/runtime-config-logging.html#:~:text=change%20this%20setting.-,log_temp_files,-(integer)\">log_temp_files<\/a> parameter to log the queries that create temp files, and then tune those queries<\/strong>.<\/p>\n<h2 id=\"Frequency-of-checkpoints?\" data-renderer-start-pos=\"1515\"><strong data-renderer-mark=\"true\">Frequency of checkpoints<\/strong><\/h2>\n<pre class=\"lang:pgsql decode:true \">select * from pg_stat_bgwriter;<\/pre>\n<p>There are two important columns here: <strong 
data-renderer-mark=\"true\">checkpoints_req and checkpoints_timed<\/strong>. <strong data-renderer-mark=\"true\">If checkpoints_req is higher than checkpoints_timed, PostgreSQL is forcing checkpoints because of high WAL generation.<\/strong><\/p>\n<p>Frequent checkpoints put extra IO load on the machine, so increase the max_wal_size parameter to space them out.<\/p>\n<p>Use the query below to find the frequency of the checkpoints:<\/p>\n<pre class=\"lang:pgsql decode:true \">WITH sub as (\n  SELECT \n    EXTRACT(\n      EPOCH \n      FROM \n        (now() - stats_reset)\n    ) AS seconds_since_start, \n    (\n      checkpoints_timed + checkpoints_req\n    ) AS total_checkpoints \n  FROM \n    pg_stat_bgwriter\n) \nSELECT \n  total_checkpoints, \n  seconds_since_start \/ total_checkpoints \/ 60 AS minutes_between_checkpoints \nFROM \n  sub;<\/pre>\n<p><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">To determine how much data is being written for each <\/span><strong data-renderer-mark=\"true\"><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">checkpoint<\/span><\/strong><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">, set <\/span><strong data-renderer-mark=\"true\"><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">log_checkpoints<\/span><\/strong><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">. 
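A minimal sketch of enabling it (assuming a role with ALTER SYSTEM privileges):<\/span><\/p>\n<pre class=\"lang:pgsql decode:true \">ALTER SYSTEM SET log_checkpoints = on;\n-- reload the configuration so the change takes effect without a restart\nSELECT pg_reload_conf();<\/pre>\n<p><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">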
Otherwise, we may use this query to determine the average number of buffers written per checkpoint.<\/span><\/p>\n<pre class=\"lang:pgsql decode:true \">select buffers_checkpoint \/ NULLIF(checkpoints_timed + checkpoints_req, 0) as avg_buffers_per_checkpoint from pg_stat_bgwriter;<\/pre>\n<h2 id=\"Get-top-five-tables-with-highest-sequential-scans?\" data-renderer-start-pos=\"2764\"><strong data-renderer-mark=\"true\">Get the top five tables with the highest sequential scans<\/strong><\/h2>\n<pre class=\"lang:pgsql decode:true\">SELECT \n  schemaname, \n  relname, \n  seq_scan, \n  seq_tup_read, \n  seq_tup_read \/ seq_scan as avg_seq_tup_read \nFROM \n  pg_stat_all_tables \nWHERE \n  seq_scan &gt; 0 \n  and pg_total_relation_size(schemaname || '.' || relname) &gt; 104857600 \n  and schemaname not like 'pg_%' \nORDER BY \n  5 DESC \nLIMIT \n  5;<\/pre>\n<p>This query displays the five tables larger than 100MB with the highest average number of tuples read per sequential scan.<\/p>\n<h2 id=\"Find-tables-with-low-hot-updates?\" data-renderer-start-pos=\"3268\"><strong data-renderer-mark=\"true\">Find tables with low HOT updates<\/strong><\/h2>\n<p>To learn how HOT (heap-only tuple) updates work in PostgreSQL, refer to https:\/\/www.postgresql.org\/docs\/current\/storage-hot.html.<\/p>\n<p>To identify the HOT update ratio for tables, use the query below.<\/p>\n<pre class=\"lang:pgsql decode:true \">SELECT \n  pg_stat_all_tables.schemaname || '.' 
|| pg_stat_all_tables.relname AS TABLE_NAME, \n  pg_size_pretty(\n    pg_relation_size(relid)\n  ) AS table_size, \n  coalesce(\n    t.spcname, \n    (\n      SELECT \n        spcname \n      FROM \n        pg_tablespace \n      WHERE \n        oid = (\n          SELECT \n            dattablespace \n          FROM \n            pg_database \n          WHERE \n            datname = current_database()\n        )\n    )\n  ) AS tblsp, \n  seq_scan, \n  idx_scan, \n  n_tup_ins, \n  n_tup_upd, \n  n_tup_hot_upd, \n  n_tup_del, \n  coalesce(n_tup_ins, 0)+ 2 * coalesce(n_tup_upd, 0)- coalesce(n_tup_hot_upd, 0)+ coalesce(n_tup_del, 0) AS total, \n  (\n    coalesce(n_tup_hot_upd, 0):: float * 100 \/(\n      CASE WHEN n_tup_upd &gt; 0 THEN n_tup_upd ELSE 1 END\n    ):: float\n  ):: numeric(10, 2) AS hot_rate, \n  (\n    SELECT \n      v[1] \n    FROM \n      regexp_matches(\n        reloptions :: text, 'fillfactor=(\\d+)'\n      ) AS r(v) \n    LIMIT \n      1\n  ) AS fillfactor \nFROM \n  pg_stat_all_tables \n  JOIN pg_class c ON c.oid = relid \n  LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace \nWHERE \n  (\n    coalesce(n_tup_ins, 0)+ coalesce(n_tup_upd, 0)+ coalesce(n_tup_del, 0)\n  )&gt; 0 \n  AND pg_stat_all_tables.schemaname NOT IN ('pg_catalog', 'pg_global') \nORDER BY \n  total DESC \nLIMIT \n  50;<\/pre>\n<h2 id=\"Get-table-bloat-information?\" data-renderer-start-pos=\"4721\"><strong data-renderer-mark=\"true\">Get table bloat information<\/strong><\/h2>\n<p>Bloat in tables or indexes slows down performance through unnecessary IO. 
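A quick first-pass signal, before running the full estimation query, is the dead-tuple ratio from pg_stat_user_tables (the thresholds here are only illustrative):<\/p>\n<pre class=\"lang:pgsql decode:true \">SELECT \n  schemaname, \n  relname, \n  n_live_tup, \n  n_dead_tup, \n  -- share of dead tuples per table; NULLIF guards empty tables\n  round(\n    n_dead_tup :: numeric * 100 \/ NULLIF(n_live_tup + n_dead_tup, 0), \n    2\n  ) as dead_pct \nFROM \n  pg_stat_user_tables \nWHERE \n  n_dead_tup &gt; 1000 \nORDER BY \n  dead_pct DESC NULLS LAST \nLIMIT \n  10;<\/pre>\n<p>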
It is recommended to check for bloat regularly and to defragment the affected tables with pg_repack.<\/p>\n<pre class=\"lang:pgsql decode:true \">WITH constants AS \n(\n       \t-- define some constants for sizes of things\n       \t-- for reference down the query and easy maintenance\n       \n   SELECT\n      current_setting('block_size')::numeric AS bs,\n      23 AS hdr,\n      8 AS ma \n)\n,\nno_stats AS \n(\n       \t-- screen out tables that have attributes\n       \t-- which don't have stats, such as JSON\n       \n   SELECT\n      table_schema,\n      table_name,\n              n_live_tup::numeric as est_rows,\n              pg_table_size(relid)::numeric as table_size     \n   FROM\n      information_schema.columns         \n      JOIN\n         pg_stat_user_tables as psut            \n         ON table_schema = psut.schemaname            \n         AND table_name = psut.relname         \n      LEFT OUTER JOIN\n         pg_stats         \n         ON table_schema = pg_stats.schemaname             \n         AND table_name = pg_stats.tablename             \n         AND column_name = attname     \n   WHERE\n      attname IS NULL         \n      AND table_schema NOT IN \n      (\n         'pg_catalog',\n         'information_schema'\n      )\n          \n   GROUP BY\n      table_schema,\n      table_name,\n      relid,\n      n_live_tup \n)\n,\nnull_headers AS \n(\n       \t-- calculate null header sizes\n       \t-- omitting tables which don't have complete stats\n       \t-- and attributes which aren't visible\n       \n   SELECT\n              hdr + 1 + (sum(\n      case\n         when\n            null_frac &lt;&gt; 0 \n         THEN\n            1 \n         else\n            0 \n      END\n) \/ 8) as nullhdr,         SUM((1 - null_frac)*avg_width) as datawidth,         MAX(null_frac) as maxfracsum,         schemaname,         tablename,         hdr, ma, bs     \n   FROM\n      pg_stats \n      CROSS JOIN\n         constants         \n      LEFT OUTER JOIN\n         
no_stats             \n         ON schemaname = no_stats.table_schema             \n         AND tablename = no_stats.table_name     \n   WHERE\n      schemaname NOT IN \n      (\n         'pg_catalog', 'information_schema'\n      )\n              \n      AND no_stats.table_name IS NULL         \n      AND EXISTS \n      (\n         SELECT\n            1             \n         FROM\n            information_schema.columns                 \n         WHERE\n            schemaname = columns.table_schema                     \n            AND tablename = columns.table_name \n      )\n          \n   GROUP BY\n      schemaname,\n      tablename,\n      hdr,\n      ma,\n      bs \n)\n,\ndata_headers AS \n(\n       \t-- estimate header and row size\n       \n   SELECT\n              ma,\n      bs,\n      hdr,\n      schemaname,\n      tablename,\n              (datawidth + (hdr + ma - (\n      case\n         when\n            hdr % ma = 0 \n         THEN\n            ma \n         ELSE\n            hdr % ma \n      END\n)))::numeric AS datahdr,         (maxfracsum*(nullhdr + ma - (\n      case\n         when\n            nullhdr % ma = 0 \n         THEN\n            ma \n         ELSE\n            nullhdr % ma \n      END\n))) AS nullhdr2     \n   FROM\n      null_headers \n)\n, table_estimates AS \n(\n       \t-- make estimates of how large the table should be\n       \t-- based on row and page size\n       \n   SELECT\n      schemaname,\n      tablename,\n      bs,\n              reltuples::numeric as est_rows,\n      relpages * bs as table_bytes,\n          CEIL((reltuples*             (datahdr + nullhdr2 + 4 + ma -                 (\n      CASE\n         WHEN\n            datahdr % ma = 0                     \n         THEN\n            ma \n         ELSE\n            datahdr % ma \n      END\n)                 ) \/ (bs - 20))) * bs AS expected_bytes,         reltoastrelid     \n   FROM\n      data_headers         \n      JOIN\n         pg_class \n         ON tablename = 
relname         \n      JOIN\n         pg_namespace \n         ON relnamespace = pg_namespace.oid             \n         AND schemaname = nspname     \n   WHERE\n      pg_class.relkind = 'r' \n)\n, estimates_with_toast AS \n(\n       \t-- add in estimated TOAST table sizes\n       \t-- estimate based on 4 toast tuples per page because we don't have\n       \t-- anything better.  also append the no_stats tables\n       \n   SELECT\n      schemaname,\n      tablename,\n              TRUE as can_estimate,\n              est_rows,\n              table_bytes + ( coalesce(toast.relpages, 0) * bs ) as table_bytes,\n              expected_bytes + ( ceil( coalesce(toast.reltuples, 0) \/ 4 ) * bs ) as expected_bytes     \n   FROM\n      table_estimates \n      LEFT OUTER JOIN\n         pg_class as toast         \n         ON table_estimates.reltoastrelid = toast.oid             \n         AND toast.relkind = 't' \n)\n,\ntable_estimates_plus AS \n(\n   -- add some extra metadata to the table data\n   -- and calculations to be reused\n   -- including whether we can't estimate it\n   -- or whether we think it might be compressed\n       \n   SELECT\n      current_database() as databasename,\n                  schemaname,\n      tablename,\n      can_estimate,\n                  est_rows,\n                  \n      CASE\n         WHEN\n            table_bytes &gt; 0                 \n         THEN\n            table_bytes::NUMERIC                 \n         ELSE\n            NULL::NUMERIC \n      END\n                      AS table_bytes,             \n      CASE\n         WHEN\n            expected_bytes &gt; 0                 \n         THEN\n            expected_bytes::NUMERIC                 \n         ELSE\n            NULL::NUMERIC \n      END\n                          AS expected_bytes,             \n      CASE\n         WHEN\n            expected_bytes &gt; 0 \n            AND table_bytes &gt; 0                 \n            AND expected_bytes &lt;= table_bytes             
    \n         THEN\n(table_bytes - expected_bytes)::NUMERIC                 \n         ELSE\n            0::NUMERIC \n      END\n      AS bloat_bytes     \n   FROM\n      estimates_with_toast     \n   UNION ALL\n       \n   SELECT\n      current_database() as databasename,\n              table_schema,\n      table_name,\n      FALSE,\n              est_rows,\n      table_size,\n              NULL::NUMERIC,\n      NULL::NUMERIC     FROM no_stats \n)\n,\nbloat_data AS \n(\n       \t-- do final math calculations and formatting\n       \n   select\n      current_database() as databasename,\n              schemaname,\n      tablename,\n      can_estimate,\n              table_bytes,\n      round(table_bytes \/ (1024 ^ 2)::NUMERIC, 3) as table_mb,\n              expected_bytes,\n      round(expected_bytes \/ (1024 ^ 2)::NUMERIC, 3) as expected_mb,\n              round(bloat_bytes*100 \/ table_bytes) as pct_bloat,\n              round(bloat_bytes \/ (1024::NUMERIC ^ 2), 2) as mb_bloat,\n              table_bytes,\n      expected_bytes,\n      est_rows     \n   FROM\n      table_estimates_plus \n)\n-- filter output for bloated tables\nSELECT\n   databasename,\n   schemaname,\n   tablename,\n       can_estimate,\n       est_rows,\n       pct_bloat,\n   mb_bloat,\n       table_mb \nFROM\n   bloat_data \t-- this where clause defines which tables actually appear\n   -- in the bloat chart\n   -- example below filters for tables which are either 50%\n   -- bloated and more than 20mb in size, or more than 25%\n   -- bloated and more than 1GB in size\nWHERE\n   (\n      pct_bloat &gt;= 50 \n      AND mb_bloat &gt;= 20 \n   )\n       \n   OR \n   (\n      pct_bloat &gt;= 25 \n      AND mb_bloat &gt;= 1000 \n   )\nORDER BY\n   pct_bloat DESC;<\/pre>\n<h2 id=\"Get-list-of-unused-indexes?\" data-renderer-start-pos=\"10500\"><strong data-renderer-mark=\"true\">Get list of unused indexes?<\/strong><\/h2>\n<p>The following query will return any unused indexes which are not part of any 
constraint.<\/p>\n<pre class=\"lang:pgsql decode:true \">SELECT s.schemaname,\n       s.relname AS tablename,\n       s.indexrelname AS indexname,\n       pg_relation_size(s.indexrelid) AS index_size\nFROM pg_catalog.pg_stat_user_indexes s\n   JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid\nWHERE s.idx_scan = 0                -- has never been scanned\n  AND 0 &lt;&gt; ALL (i.indkey)      -- no index column is an expression\n  AND NOT i.indisunique             -- is not a UNIQUE index\n  AND NOT EXISTS                    -- does not enforce a constraint\n        (SELECT 1 FROM pg_catalog.pg_constraint c\n         WHERE c.conindid = s.indexrelid)\n  AND NOT EXISTS                    -- is not an index of a partition\n        (SELECT 1 FROM pg_catalog.pg_inherits AS inh\n         WHERE inh.inhrelid = s.indexrelid)\nORDER BY pg_relation_size(s.indexrelid) DESC;<\/pre>\n<p id=\"We-appreciate-you-taking-the-time-to-explore-the-content-above.-I-hope-the-information-served-the-reason-for-which-you-sought-out-the-blog.[hardBreak]Comments-on-how-to-make-the-blog-better-are-indeed-very-appreciated.-If-you-have-any-questions,-suggestions,-or-criticism,-kindly-email-us.\" data-renderer-start-pos=\"12257\"><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">We appreciate you taking the time to read this post, and we hope it answered the questions that brought you here.<\/span><\/p>\n<p data-renderer-start-pos=\"12257\"><span class=\"fabric-text-color-mark\" data-renderer-mark=\"true\" data-text-custom-color=\"#0747a6\">Feedback on how to improve the blog is always appreciated. 
If you have any questions, suggestions, or criticism, kindly email us.<\/span><\/p>\n<p data-renderer-start-pos=\"12538\">To be informed about all of our content, subscribe now.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post, we&#8217;ll look at a few crucial queries that are helpful for digging into PostgreSQL problems and evaluating the health of an existing instance. What is hit ratio of the database? SELECT datname, ( blks_hit * 100 \/(blks_hit + blks_read) ):: numeric as hit_ratio from pg_stat_database WHERE datname not in ( &#8216;postgres&#8217;, &#8216;template0&#8217;,&hellip;<\/p>\n","protected":false},"author":1,"featured_media":2538,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28],"tags":[276],"class_list":["post-2536","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-postgresql","tag-postgresql-health-check","category-28","description-off"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts\/2536","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/comments?post=2536"}],"version-history":[{"count":0,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts\/2536\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/media\/2538"}],"wp:attachment":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/media?parent=2536"
}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/categories?post=2536"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/tags?post=2536"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}