{"id":2619,"date":"2024-08-16T06:35:09","date_gmt":"2024-08-16T06:35:09","guid":{"rendered":"https:\/\/bynatree.com\/?p=2619"},"modified":"2024-08-16T06:35:09","modified_gmt":"2024-08-16T06:35:09","slug":"hypothetical-indexes-in-postgresql-enhancing-performance-without-the-overhead","status":"publish","type":"post","link":"https:\/\/divaind.com\/ie1\/2024\/08\/16\/hypothetical-indexes-in-postgresql-enhancing-performance-without-the-overhead\/","title":{"rendered":"Hypothetical Indexes in PostgreSQL: Enhancing Performance Without the Overhead"},"content":{"rendered":"<h2><span style=\"font-weight: 400;\">Introduction<\/span><\/h2>\n<p style=\"text-align: left; padding-left: 40px;\"><span style=\"font-weight: 400;\">Indexes are essential for speeding up query execution in PostgreSQL, but determining whether an index will be utilized in the execution plan can be challenging without actually creating it. However, on large tables, index creation can be time-consuming and resource-intensive. This is where virtual, or hypothetical, indexes come into play\u2014they allow you to assess the potential impact of an index without the overhead of actually building it.<\/span><\/p>\n<blockquote><p><span style=\"font-weight: 400;\">A virtual or hypothetical index is a simulated index that allows database administrators to predict query performance without physically creating the index. By analyzing the estimated query plan using this virtual index, they can assess the potential benefits before committing to a real index. This approach optimizes database performance by preventing unnecessary index creation, saving both time and resources.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><strong>HypoPG<\/strong> is a <strong>PostgreSQL extension<\/strong> that adds support for hypothetical or virtual indexes. Once the extension is created, you can start using it immediately. Hypothetical indexes created within one session are isolated and will not affect other sessions, ensuring they remain non-intrusive to the overall database performance.<\/span><\/p><\/blockquote>\n<p><span style=\"font-weight: 400;\"><strong>HypoPG<\/strong> supports the following access methods for hypothetical indexes:<\/span><b><\/b><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li><b>btree<\/b><\/li>\n<li><b>brin<\/b><\/li>\n<li><b>hash<\/b><span style=\"font-weight: 400;\"> (requires PostgreSQL 10 or later)<\/span><\/li>\n<li><b>bloom<\/b><span style=\"font-weight: 400;\"> (requires the bloom extension)<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">However GIST\/GIN Access methods are not supported<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Let&#8217;s walk through an example of how to use HypoPG to evaluate the potential impact of an index without actually creating it.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Install the HypoPG Extension<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">First, you need to install the <strong>HypoPG<\/strong> extension in your PostgreSQL database:<br \/>\n<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">yum install hypopg_15\n-bash-4.2$ psql\npsql (15.8)\nType \"help\" for help.\npostgres=# create extension hypopg; \nCREATE EXTENSION\n<\/pre>\n<p><span style=\"font-weight: 400;\">Once installed, <strong>HypoPG<\/strong> allows you to create hypothetical indexes and analyze query performance as if the indexes physically exist.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Create a hypothetical index<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Create a test table with random data<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# create table test(id int, name varchar);\nCREATE TABLE\npostgres=# insert into test select generate_series(1,1000000),md5(generate_series(1,1000000)::text);\nINSERT 0 1000000\n<\/pre>\n<p><span style=\"font-weight: 400;\">Now below is the execution plan of the simple query without any index on the id column<br \/>\n<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# explain select * from test where id = 1;\n                               QUERY PLAN\n------------------------------------------------------------------------\n Gather  (cost=1000.00..14542.43 rows=1 width=37)\n   Workers Planned: 2\n   -&gt;  Parallel Seq Scan on test  (cost=0.00..13542.33 rows=1 width=37)\n         Filter: (id = 1)\n(4 rows)\n<\/pre>\n<blockquote>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">&#8220;Due to the lack of an index on the table, a full table scan is currently being performed, which can be inefficient. Adding a simple B-tree index could significantly improve query performance. To assess this potential optimization without making any changes to the actual database schema, we will use HypoPG. The <\/span><strong><em>hypopg_create_index()<\/em><\/strong><span style=\"font-weight: 400;\"> function enables us to create hypothetical indexes for performance comparison, allowing us to evaluate the impact of the index without altering the database.&#8221;<\/span><\/p>\n<\/blockquote>\n<p>&nbsp;<\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# SELECT * FROM hypopg_create_index('CREATE INDEX ON test (id)') ;\n indexrelid |      indexname\n------------+----------------------\n      14264 | &lt;14264&gt;btree_test_id\n(1 row)\n<\/pre>\n<p><span style=\"font-weight: 400;\">The function returns two columns:\u00a0<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">the object identifier of the hypothetical index\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">the generated hypothetical index name\u00a0<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">We can run the <strong>EXPLAIN<\/strong> again to see if <strong>PostgreSQL<\/strong> would use this <strong>index<\/strong><br \/>\n<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# explain select * from test where id = 1;\n                                     QUERY PLAN\n------------------------------------------------------------------------------------\n Index Scan using \"&lt;14264&gt;btree_test_id\" on test  (cost=0.05..8.07 rows=1 width=37)\n   Index Cond: (id = 1)\n(2 rows)\n<\/pre>\n<p><span style=\"font-weight: 400;\">As you can see the index is being used in the estimated execution plan of the query, so now users can create this index to reduce the execution time of the query.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Hypopg Functions<\/span><\/h2>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Below are some of the functions provided by the hypoPG which are useful for working with hypoPG.<\/span><\/p>\n<p><em><b>hypopg_list_indexes<\/b><\/em><span style=\"font-weight: 400;\">: view that lists all hypothetical indexes that have been created<br \/>\n<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# select * from hypopg_list_indexes ;\n indexrelid |      index_name      | schema_name | table_name | am_name\n------------+----------------------+-------------+------------+---------\n      14264 | &lt;14264&gt;btree_test_id | public      | test       | btree\n(1 row)\n<\/pre>\n<p><em><b>hypopg_get_indexdef(oid)<\/b><\/em><span style=\"font-weight: 400;\"><em>:<\/em> function that lists the <strong>CREATE INDEX<\/strong> statement that would recreate a stored hypothetical index.<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# SELECT index_name, hypopg_get_indexdef(indexrelid) FROM hypopg_list_indexes;\n      index_name      |             hypopg_get_indexdef\n----------------------+----------------------------------------------\n &lt;14264&gt;btree_test_id | CREATE INDEX ON public.test USING btree (id)\n(1 row)\n<\/pre>\n<p><em><b>hypopg_relation_size(oid)<\/b><\/em><span style=\"font-weight: 400;\"><em>:<\/em> This function allows the users to know the estimated size of the index once it is created.<\/span><\/p>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# SELECT index_name, pg_size_pretty(hypopg_relation_size(indexrelid)) FROM hypopg_list_indexes;\n      index_name      | pg_size_pretty\n----------------------+----------------\n &lt;14264&gt;btree_test_id | 25 MB\n(1 row)\n\npostgres=# create index idx_id on test(id);\nCREATE INDEX\npostgres=# di+ idx_id\n                                       List of relations\n Schema |  Name  | Type  |  Owner   | Table | Persistence | Access method | Size  | Description\n--------+--------+-------+----------+-------+-------------+---------------+-------+-------------\n public | idx_id | index | postgres | test  | permanent   | btree         | 21 MB |\n(1 row)\n<\/pre>\n<p><span style=\"font-weight: 400;\">The index size is near to the estimated size of the index.<\/span><\/p>\n<ul>\n<li><b>hypopg_drop_index(oid)<\/b><span style=\"font-weight: 400;\">: function that removes the given hypothetical index<\/span><\/li>\n<li><b>hypopg_reset()<\/b><span style=\"font-weight: 400;\">: function that removes all hypothetical indexes<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400;\">Hypothetically hide existing indexes<\/span><\/h2>\n<blockquote>\n<p style=\"padding-left: 40px;\"><em><b>hypopg_hide_index(oid)<\/b><\/em><span style=\"font-weight: 400;\"><em>:<\/em> function that allows you to hide an index in the <strong>EXPLAIN<\/strong> output by using its OID. It returns true if the index was successfully hidden, and false otherwise. This will help to identify the impact of deleting one of the indexes from two indexes which have similar columns.<\/span><\/p>\n<\/blockquote>\n<pre class=\"theme:coda-special-board lang:pgsql decode:true\">postgres=# EXPLAIN SELECT * FROM test WHERE id = 1;\n                                     QUERY PLAN\n------------------------------------------------------------------------------------\n Index Scan using \"&lt;14264&gt;btree_test_id\" on test  (cost=0.05..8.07 rows=1 width=37)\n   Index Cond: (id = 1)\n(2 rows)\n\npostgres=#  select oid from pg_class where relname='idx_id';\n  oid\n-------\n 16535\n(1 row)\n\npostgres=# select hypopg_hide_index(16535);\n hypopg_hide_index\n-------------------\n t\n(1 row)\n\npostgres=#  EXPLAIN SELECT * FROM test WHERE id = 1;\n                               QUERY PLAN\n------------------------------------------------------------------------\n Gather  (cost=1000.00..14542.43 rows=1 width=37)\n   Workers Planned: 2\n   -&gt;  Parallel Seq Scan on test  (cost=0.00..13542.33 rows=1 width=37)\n         Filter: (id = 1)\n(4 rows)\n<\/pre>\n<ul>\n<li><b>hypopg_unhide_index(oid)<\/b><span style=\"font-weight: 400;\">: function that restore a previously hidden index in the EXPLAIN output by using its OID. It returns true if the index was successfully restored, and false otherwise.<\/span><\/li>\n<li><b>hypopg_unhide_all_index()<\/b><span style=\"font-weight: 400;\">: function that restore all hidden indexes and returns void.<\/span><\/li>\n<li><b>hypopg_hidden_indexes()<\/b><span style=\"font-weight: 400;\">: function that returns a list of OIDs for all hidden indexes.<\/span><\/li>\n<li style=\"text-align: left;\"><b>hypopg_hidden_indexes<\/b><span style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">: view that returns a formatted list of all hidden indexes.<\/span><\/span><\/li>\n<\/ul>\n<hr \/>\n<p style=\"text-align: center;\">Thanks for Reading<\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Indexes are essential for speeding up query execution in PostgreSQL, but determining whether an index will be utilized in the execution plan can be challenging without actually creating it. However, on large tables, index creation can be time-consuming and resource-intensive. This is where virtual, or hypothetical, indexes come into play\u2014they allow you to assess&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28],"tags":[161,238,266],"class_list":["post-2619","post","type-post","status-publish","format-standard","hentry","category-postgresql","tag-hypopg","tag-performance","tag-postgresql","category-28","description-off"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts\/2619","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/comments?post=2619"}],"version-history":[{"count":0,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/posts\/2619\/revisions"}],"wp:attachment":[{"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/media?parent=2619"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/categories?post=2619"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/divaind.com\/ie1\/wp-json\/wp\/v2\/tags?post=2619"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}