Apr 11, 2024
Posted by
Ana Tavares
Nikolay Samokhvalov not only solves problems with SQL but also assists others in doing so. Having worked with the programming language for over 20 years, he is now committed to helping clients optimize query performance, PostgreSQL scalability, and change management.
His one advice for those who have recently started learning the language? Start big. Don’t use small tables; pick a sizeable environment where you can experiment. And don’t think twice if your job involves analyzing business metrics: modern SQL can solve all your problems.
I’m the founder and CEO/CTO of Postgres.ai, the company behind Database Lab Engine. We develop tools for those who work with PostgreSQL and a consulting on the topic (companies such as GitLab, Chewy, Mercury, Miro). We help them with query performance optimization, PostgreSQL scalability, and change management.
It was at the beginning of the 1990s at school. I went to my father’s work and saw a very old IBM. Right after that, I convinced my parents to buy me a simple computer. It was very similar to the ZX Spectrum. I started learning BASIC, assembly language, Pascal, C, and so on. Somehow I understood that I liked programming much more than anything else.
I have been using SQL for more than 20 years—since my university time.
At university, the field of database theory and, mainly, relational databases were my primary fields of interest. So, learning SQL (and SQL standard–SQL:1999, then called SQL:200n) was an obvious goal. Later, I started to work with Oracle and SQL Server. As a backend engineer, I was always focused on database-related tasks. Then I was CTO/COO/CEO of a few “social media” startups, and I used SQL mainly to analyze business metrics. Of course, by that time, I switched to open-source software – first of all, for economical reasons—and almost immediately it was Postgres (I managed to stay on MySQL for only one week).
“It is ridiculous that oftentimes queries are created by engineers who don’t have access to production”
Recently, I’ve founded and am growing Postgres.ai—we build tools for developers to help them improve their experience with SQL. Particularly, we improve SQL optimization and change management workflows using thin clones and database branching. To illustrate it, I have a meme:
In other words, people usually have issues with checking optimization ideas because non-production environments are far from perfect (they are small, outdated, used by colleagues, and so on). With Database Lab Engine, engineers can have many clones of a large-size database on a single machine and start checking any optimization ideas for their SQL queries. Sometimes, finally, they start learning how SQL behaves on large databases! It is ridiculous that oftentimes queries are created by engineers who don’t have access to production. So they cannot see how the execution of their own queries happens—we solve this problem.
I also continue helping people optimize SQL query performance and database migrations (schema changes). Check out my two talks at the recent PGCon and the article “Common DB schema change mistakes.”
So, I’m not only solving problems with SQL but also trying to help others to solve problems with SQL better—without downtime, with good performance, and making changes at a fast pace.
It’s my native programming language.
There is a very common problem in social media: for a user that has subscribed to many streams of data (such as blogs or social profiles), return N latest article (posts, tweets, etc.). We can solve it straightforwardly with a few lines of SQL, but these lines will have a poor performance on large data volumes (millions of users, thousands of subscriptions) in most cases.
At this point, many people give up and switch to NoSQL database management systems—just because of this problem. But this can be solved perfectly with modern SQL—with recursive common table expression (CTE), arrays, and various PostgreSQL tricks! The recipe was described long ago by Maxim Boguk (see this slide deck). I probably need to write an article about it too.
“Don’t learn SQL using small tables; it’s a mistake”
When I first applied this trick, it took two days to engineer it. The query had a couple of hundred lines (!), but the performance boost was astonishing, unlocking further growth for my projects. I remember query time dropped from approximately 5-10 seconds to around 10 milliseconds. The lesson was: learn modern SQL; it’s worth doing so.
Don’t learn SQL using small tables; it’s a mistake. If you do this, you won’t understand the performance part of SQL, and you’ll need 10 years to become an expert—this is a roadblock you want to eliminate.
Think about where you're going to test your ideas and what will be your playground: where you’ll check your crazy ideas, not hesitating and not disturbing your colleagues.
This environment must have several properties to give you the freedom to experiment:
Eventually, I hope, you’ll end up considering our Database Lab Engine—with it, anyone in your organization gets the essential freedom of experimenting. Remember Haki Benita’s question from the previous interview (“Where can a junior developer find a large database to experiment on?”) Now you have the answer!
I’m constantly learning. SQL is huge and permanently growing. I use PostgreSQL docs and SQL standard texts. I can also recommend Modern SQL and Use the Index, Luke for those who are starting to work with large databases and don’t yet know about LATERAL
, WITH RECURSIVE
, window functions, and so on. Markus Winand has great materials.
Our Database Lab Engine is the tool I recommend, of course. My 20+ years of career in the field of databases led me to develop it. The tips are simple—check our website, Postgres.ai, and see if you have problems optimizing SQL or testing database changes. If you see such issues, you should try Database Lab Engine.