How Notado Syncs Data from PostgreSQL to Meilisearch
An example of how to write your own PostgreSQL-Meilisearch sync service
After sharing my last post about Notado’s migration away from Kubernetes on Digital Ocean to Fly.io, I received an emailing inquiring about notado-postgres-listener
, aka. the caretaker.
Hi there! I found your recent blog post via Hacker News, and it was very interesting — but one throwaway reference particularly caught my eye, and I wanted to send you an email and follow up on it. You mentioned that a part of your system is “notado-postgres-listener”, a Go microservice that sends content from PostgreSQL to a Meilisearch instance. I just discovered Meilisearch, and I’m considering using it for a project, but I wasn’t sure what was the best way to keep it in sync with the data in my Postgres database. It sounds like you’ve found (built) a robust system for doing that yourself!
Can you tell me a bit about this system, so I know what’s involved in building a similar one as well? I don’t know if you’re interested in open sourcing the code, but if you are, that would be amazing! But if not, a text description would still be super helpful. I’m just looking for some advice from someone who has built this already!
Hopefully I can cut the difference between open sourcing the code and providing a text description with some Golang pseudocode.
There are already many great tutorials and SQL examples out there for creating a PostgreSQL trigger that will send out JSON notifications on inserts to a table, so I won’t cover that here.
If you’re reading this article and all of this is fairly new to you, please note that the listener that I’ll describe below will not receive anything until you have created a trigger in your database that sends out notifications.
Connecting to Meilisearch
The first thing you’ll need to do is connect to a running Meilisearch instance and ensure that your destination index is created and configured according to your needs.
Listening to PostgreSQL Notifications
The next thing will be to create a listener that will receive notifications from a PostgreSQL channel. Let’s assume for the sake of simplicity that we have a table called “some-table” and that the trigger which runs whenever a row is inserted emits a notification to a channel also named “some-table”.
The lib/pq
library is great for this and you only really need to concern yourself with the NewListener function.
NewListener creates a new database connection dedicated to LISTEN / NOTIFY.
name should be set to a connection string to be used to establish the database connection (see section "Connection String Parameters" above).
minReconnectInterval controls the duration to wait before trying to re-establish the database connection after connection loss. After each consecutive failure this interval is doubled, until maxReconnectInterval is reached. Successfully completing the connection establishment procedure resets the interval back to minReconnectInterval.
The last parameter eventCallback can be set to a function which will be called by the Listener when the state of the underlying database connection changes. This callback will be called by the goroutine which dispatches the notifications over the Notify channel, so you should try to avoid doing potentially time-consuming operations from the callback.
Sending New Data from PostgreSQL to Meilisearch
At this point we have the two key components - a connection to a Meilisearch instance, and a listener receiving notifications whenever inserts are made in to a PostgreSQL table. The next step is to send a request to Meilisearch whenever a notification is received on the listener.
Going a Little Farther
We now have something functional that will sync new data from PostgreSQL to Meilisearch, but it’s quite likely that you’ll want to be able to sync older data as well at some point. This might be because your VM blew up and you need to fully rehydrate your Meilisearch indices, or maybe because you had a little downtime, and you want to sync the last 30 minutes’ worth of data just to be sure nothing is missing from your Meilisearch indices.
You could consider wrapping this listener in a simple internal HTTP service which exposes a couple of rehydration routes, maybe one to fully rehydrate your indices and another one where you can specify how many minutes or hours you want to go back. This could probably be a CLI tool as well.
In either case, the approach will be the same; run a query to get the rows within the desired timeframe, then insert them into your Meilisearch indices.
Depending on the complexity of what you’re trying to build, you may need more than this or you may need less than this. Whatever your use-case, this article should provide you with enough of a starting point to crack open your editor and start building.
Can’t We Just Use PostgreSQL for Full Text Search?
Some readers may be thinking “Why go to all this trouble? PostgreSQL already does full text search!”
Back in 2020, Notado actually was using PostgreSQL for full text search!
As I have shared previously, Notado is the culmination of my desire for a content-first bookmarking service that addresses the frustrations that I experienced with other bookmarking services that treated content as optional, additional metadata.
Although I write these articles in English, I am not a monolingual Anglophone, and part of the reason why Notado’s design has excellent support for saved content in RTL (right-to-left) scripts is that I myself often save content in languages written in RTL scripts (and in languages written using the Latin alphabet with various diacritics, and other languages that don’t use the Latin alphabet at all).
After being left disappointed by the quality of PostgreSQL’s multi-lingual full text search capabilities, I decided to try out Meilisearch on a whim, and I was immediately blown away by the quality of the results returned (especially for RTL languages).
This was two years ago, and in my opinion Meilisearch has only improved since then.
If you are a solo developer, you’ll definitely appreciate the developer-friendly documentation provided on the Meilisearch website, which can easily take you from exploring data on your local development environment to deploying a Meilisearch instance to a production environment within a few days.
Thanks for sticking around until the end! If you want to get in touch, my Twitter handle is @notado_app and you can also find me in the Notado Discord server.
If you would like to reach out by email or request a technical article on a different part of Notado’s technology stack, you can reach me at hello at notado dot app.
If you are interested in what I read in order to come up with solutions like notado-postgres-listener
, you can check out my Software Development feed on Notado, which you can also subscribe to via RSS.
Alternatively, if you’ve had enough technical reading for today, I also use Notado to curate feeds on Addiction (RSS), Capitalism (RSS), and Mental Health (RSS).
The next poor soul who would find this article after struggling with meilisync can publish the Golang `syncer` described here :)
Great writeup!
Currently I'm comparing MongoDB (with builtin search) and PG+meilisearch (with synchronization you described in this post) ...