Database Tuning at Zerodha - India's Largest Stock Broker
Summary
TLDR: This talk is given by a member of Zerodha's technology team, describing their experience building and operating databases on PostgreSQL, specifically for managing critical financial data. He explains how Zerodha's database systems are organized, including sharding data by financial year and using an additional caching layer to optimize database performance. He stresses the importance of finding the right balance in the use of indexing and materialized views to remove bottlenecks in database queries. One example of good practice he gives is using partial indexes rather than relying on excessive indexing. He also emphasizes the importance of understanding the query planner and using denormalization to work around performance bottlenecks. The speaker shares practical, hard-won lessons on scaling successfully without over-engineering solutions, and touches on the need to stay flexible about the choice of database.
Takeaways
- 🍽️ The session takes place right after lunch, with the audience encouraged to stay alert.
- 📉 Zerodha, a fintech stock broker, uses PostgreSQL to manage critical financial data.
- 🔍 Personal experience and key lessons in database management.
- 🗄️ Sharding by financial year keeps the data load manageable.
- 💾 A caching layer to optimize serving of ad-hoc report data.
- 🤝 Praise for the PostgreSQL community and its excellent documentation.
- 🛠️ Partial indexing as a practice for better database performance.
- 🛡️ PostgreSQL has proven resilient, even through early mistakes such as overflowing log files.
- 🚀 Materialized views and denormalization to optimize queries.
- 🔧 Manually configured vacuuming to suit an irregular data-import schedule.
Timeline
- 00:00:00 - 00:05:00
The speaker opens by setting context on how Zerodha uses Postgres to manage its data. He stresses that their lessons may not apply to anyone else, arguing that database decisions should be specific, not generic. He also thanks the Postgres development team for its excellent documentation and for how resilient Postgres has been.
- 00:05:00 - 00:10:00
Zerodha's use of Postgres began when the company was new and imported around 150 MB of data per day. Problems faced included over-indexing, overflowing log files, and crashes, all of which had to be worked through. The lessons learned were to refine the schema design, shape the application to fit the database, and fix problems quickly for efficient improvement.
- 00:10:00 - 00:15:00
The speaker talks about managing large data, noting that the term 'big data' is often irrelevant for most organizations. He promotes selective indexing, denormalizing data so that querying is simpler, and using materialized views for better performance.
- 00:15:00 - 00:20:00
Query management and optimization become the main topic, exploring selective indexing, denormalization and materialized views to make queries more efficient. The speaker discusses the importance of understanding the query planner and tuning the database around individual tables and specific queries.
- 00:20:00 - 00:25:00
Understanding the query planner and the shape of the data improves database performance. He also covers experimenting with Postgres autovacuum, eventually concluding that in their context it is better turned off and managed manually, because it does not fit their irregular import schedule. These lessons were learned through experience.
- 00:25:00 - 00:30:00
Postgres is also used for caching at Zerodha, with a secondary database caching heavily accessed data. This keeps load off the primary system and structures the caching of report data so it can be served quickly during trading hours.
- 00:30:00 - 00:35:00
The presentation explains the caching structure and query flow used to reduce load on the primary database while serving traders. The speaker describes partitioning data by financial year and using foreign data wrappers for large-scale data management.
- 00:35:00 - 00:44:36
The speaker closes by encouraging people to test and use the database solution that fits their own context best, without being overly biased toward Postgres or any other system. He notes that a handful of solutions that were never over-engineered have given Zerodha significant progress and stability.
Frequently asked questions
What is Zerodha?
Zerodha is an Indian fintech stock broker, sometimes described as the Robinhood of India.
When is the stock market open in India?
The Indian stock market is open from 9:15 am to 3:30 pm on every trading day.
What makes PostgreSQL special according to the speaker?
PostgreSQL is known for its excellent documentation and friendly community, as well as its strength and resilience when used for critical financial data.
How does Zerodha store and import trade books?
Zerodha imports trade books every night in order to compute financial information such as profit and loss and the ledger.
What kind of caching layer does Zerodha use?
Zerodha uses an additional PostgreSQL database as a caching layer, populated from the primary database, to hold temporary result data.
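As described later in the talk, the cache is simply another Postgres instance into which result sets are written as throwaway per-request tables. A minimal sketch of the idea; the table and column names are invented, and the real flow goes through their job-queue tooling rather than a single statement:

```sql
-- On the caching Postgres instance: the job runner executes the heavy report
-- query against the primary DB and writes the result set into a throwaway
-- per-request table like this one.
CREATE TABLE cache_tradebook_u1234_req42 (
    trade_date    date,
    tradingsymbol text,
    quantity      numeric,
    price         numeric
);
-- ...the job runner INSERTs/COPYs the fetched rows here...

-- All later reads for the day hit only this table; native sorting and
-- pagination are the reason Postgres (rather than Redis) was picked for
-- this layer.
SELECT *
FROM   cache_tradebook_u1234_req42
ORDER  BY trade_date DESC
LIMIT  50 OFFSET 0;
```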
Why does Zerodha not prefer traditional DB replication?
Zerodha does not run a master-slave replica setup; it relies instead on backups and data archives stored in S3, so the database can be restored quickly if something goes wrong.
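The talk does not name the backup tooling; one common way to get Postgres backups into S3, shown here purely as an illustration (the bucket and copy command are hypothetical), is WAL archiving plus periodic base backups:

```sql
-- Illustrative only: ship WAL segments to S3 as they are produced.
-- (archive_mode needs a server restart to take effect; archive_command can be reloaded.)
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'aws s3 cp %p s3://example-pg-backups/wal/%f';
SELECT pg_reload_conf();
```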
How does Zerodha manage data sharding?
Zerodha splits the database by financial year, using the foreign data wrapper to connect the data across servers.
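A minimal sketch of what such a split can look like with postgres_fdw; the server, table and financial-year names below are illustrative assumptions, not Zerodha's actual setup:

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Older financial years live on a separate, cheaper server.
CREATE SERVER fy2020_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'archive-db.internal', dbname 'console_fy2020');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER fy2020_server
  OPTIONS (user 'console_ro', password 'secret');

-- Assumes a parent table `tradebook` partitioned BY RANGE (trade_date) on the
-- primary; the old year's partition is a foreign table that transparently
-- forwards queries to the archive server.
CREATE FOREIGN TABLE tradebook_fy2020
  PARTITION OF tradebook
  FOR VALUES FROM ('2020-04-01') TO ('2021-04-01')
  SERVER fy2020_server;
```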
Why did Zerodha decide not to run autovacuum in PostgreSQL?
Vacuuming is configured manually because the irregular data-import schedule does not suit automatic autovacuum runs; it also keeps server resource usage predictable.
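A sketch of the approach described in the talk: autovacuum switched off for the bulk-loaded tables, with a plain VACUUM ANALYZE run by the import job once the night's load finishes (the table names are hypothetical):

```sql
-- Opt the heavy import tables out of autovacuum (per table, not instance-wide).
ALTER TABLE tradebook SET (autovacuum_enabled = false);
ALTER TABLE ledger    SET (autovacuum_enabled = false);

-- Run by the import script after the nightly bulk load, table by table.
-- VACUUM ANALYZE marks dead tuples reusable and refreshes planner statistics
-- without the exclusive lock that VACUUM FULL would take.
VACUUM (ANALYZE) tradebook;
VACUUM (ANALYZE) ledger;
```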
What is the role of Console in Zerodha's systems?
Console is the back-office platform that imports and stores trade books to compute financial information such as users' profit and loss and ledgers.
How does Zerodha use indexing in PostgreSQL?
Zerodha uses partial indexes and avoids over-indexing to keep query processing efficient.
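For instance, if the latency-sensitive queries only ever touch recent rows, a partial index like the sketch below (hypothetical table, columns and cutoff date) keeps the index small while still covering the hot path; rarely-run back-office reports on older rows can live without it:

```sql
-- Index only the rows the user-facing queries actually hit.
CREATE INDEX CONCURRENTLY idx_tradebook_user_recent
    ON tradebook (user_id, trade_date)
 WHERE trade_date >= DATE '2023-04-01';
```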
- 00:00:00okay um first of all uh good afternoon
- 00:00:03everyone uh I hope the lunch was good uh
- 00:00:06but obviously not too good so that you
- 00:00:08don't sleep off while I give this talk
- 00:00:11uh and welcome to my presentation of how
- 00:00:14we use Postgres in
- 00:00:16Zerodha uh and what what have we learned
- 00:00:19from it and our mistakes our experiences
- 00:00:22everything and where we are right
- 00:00:25now so setting up the context for the
- 00:00:28talk um to quote my favorite Salman Bhai
- 00:00:32movie Race 3 our learnings are our
- 00:00:34learnings none of your
- 00:00:36learnings uh what it means is that
- 00:00:38everything that I'm going to speak about
- 00:00:40here is something that we have learned
- 00:00:42in our experience in the context of the
- 00:00:45data that how we use how we import it
- 00:00:47might not apply to even one person in
- 00:00:50this room and that is how databases
- 00:00:52should be it it should not be extremely
- 00:00:54generic either um you might disagree or
- 00:00:57be triggered by how we use Postgres
- 00:01:01and that is okay I have been told by uh
- 00:01:04Kash our CTO to not make any jokes even
- 00:01:07pg3
- 00:01:10months uh little bit about me uh I've
- 00:01:12been at Zerodha since day one of the Tech
- 00:01:15Team um as the 10x engineer uh these are
- 00:01:19all memes that we have internally about
- 00:01:21each other and been managing the
- 00:01:24backend uh full stack backend for the
- 00:01:28entire time I've been at zeroda gone
- 00:01:31through all possible databases um MySQL
- 00:01:35Postgres Redis uh MongoDB uh ClickHouse
- 00:01:40CockroachDB the list is
- 00:01:43endless uh and and before I get into
- 00:01:47this talk I first of all like to I mean
- 00:01:50say thanks to the core team of postgress
- 00:01:54uh because I've come across multiple
- 00:01:56languages databases softwares FOSS or
- 00:02:00Enterprise but I don't think there has
- 00:02:03been anyone better at
- 00:02:06documenting their features as well as
- 00:02:08postgress has done I don't think there
- 00:02:10is anyone that has a better blueprint of
- 00:02:13what they want to do in their future
- 00:02:15updates like postgress has done I don't
- 00:02:17think there is I might be wrong here
- 00:02:19again because as I said it's our
- 00:02:21learnings but I don't think there is
- 00:02:22anything as resilient as Postgres has
- 00:02:24been for us um and we have done
- 00:02:28ridiculous things with it and this just
- 00:02:30worked uh from upgrading from postgress
- 00:02:348 is where we started to postgress uh we
- 00:02:37right now at
- 00:02:38PG-13 uh and it has the updation has
- 00:02:42never caused an issue no data loss
- 00:02:44nothing and that is U like cannot be
- 00:02:48more thankful to the core development
- 00:02:50team of postgress and the community of
- 00:02:52postgress which has always been super
- 00:02:54nice in answering any of our doubts on
- 00:02:56the slack
- 00:02:57channels so
- 00:03:00uh history of uh postgress usage in
- 00:03:03seroa we started out uh first of all let
- 00:03:06me set a bit of context of how zeroda
- 00:03:09imports or uses its data and maybe that
- 00:03:11will be helpful in understanding why we
- 00:03:13do things with Postgres the way we do
- 00:03:16it the as you know Zerodha is an
- 00:03:21fintech Indian broker maybe I should
- 00:03:23have introduced Zerodha first I don't
- 00:03:24think everyone knows uh what Zerodha is
- 00:03:27so we are a stock broker uh we are the Robin
- 00:03:32Hood of India or Robinhood is the Zerodha of
- 00:03:33the US uh we deal with the stock market and uh
- 00:03:38we
- 00:03:40import trade books we basically build a
- 00:03:43software for people to trade on so which
- 00:03:44means that we have to deal with all
- 00:03:46kinds of financial information and it
- 00:03:48also means Computing a lot of financial
- 00:03:50information like pnl of uh profit and
- 00:03:53loss of users how much The Ledger of
- 00:03:55users how much money they have
- 00:03:57transferred in transferred out all
- 00:03:58critical financial information that we
- 00:04:00store and we use postgress for
- 00:04:02it uh markets are open from 9:15 to 3
- 00:04:083:30 every day after that M6 is open but
- 00:04:11I don't think we ever cared a lot about
- 00:04:14it but uh yeah markets are open from
- 00:04:169:15 to 330 for majority for most of our
- 00:04:19Traders um and we our systems that we
- 00:04:24have built some of them are read only
- 00:04:26throughout the day and become write-only
- 00:04:29at night
- 00:04:30many of many of the systems that are
- 00:04:32built are usually read and write
- 00:04:34throughout the day and night but our
- 00:04:36systems are a bit different than that
- 00:04:37and the systems that I have worked on uh
- 00:04:40we have a trading platform called Kite
- 00:04:43uh which has a transactional DB which
- 00:04:46again uses Postgres that is a read write
- 00:04:48throughout the day but console which is
- 00:04:51our backend back office platform where
- 00:04:54all the trade books all the information
- 00:04:56regarding anything the user has done
- 00:04:58throughout the day on our Trading
- 00:04:59platform gets imported in that import
- 00:05:03happens at night that is the writes or
- 00:05:05bulk writes happen at night but majority
- 00:05:08of it it remains a read-only platform
- 00:05:10throughout the day with very few writes
- 00:05:12so that is the context on which we built
- 00:05:15our schemas our queries our databases
- 00:05:17and how we
- 00:05:19scale so uh we started off with
- 00:05:21importing around uh so when I joined
- 00:05:24zeroa used to have 20,000 clients um not
- 00:05:28even all of them are active and we used
- 00:05:30to import around 150 MBS of data per day
- 00:05:35at best and I used to have uh I am
- 00:05:40saying I a lot here because at point of
- 00:05:42time it was just two or three of us uh I
- 00:05:44mean if you have read our blogs you
- 00:05:46would know that we are a very lean very
- 00:05:47small team and we still have remained so
- 00:05:50like that so I used to face a lot of
- 00:05:52issues with scaling even that 100 MB of
- 00:05:55data when we started out with um when I
- 00:05:59look back back at it lot of things that
- 00:06:00I did was extremely obviously dumb uh
- 00:06:03lack of understanding of how data Works
- 00:06:05understanding of how databases work um
- 00:06:08over indexing issues under indexing
- 00:06:11everything every possible thing that you
- 00:06:12can think of can go wrong in a database
- 00:06:15um for example let's say uh the log
- 00:06:18files overflowing and causing the
- 00:06:21database to crash so everything that can
- 00:06:24possibly go wrong uh has gone wrong with
- 00:06:26us we have learned from it uh We've
- 00:06:28improved our softwares way we deal with
- 00:06:32uh storing our own data so started off
- 00:06:35with 100 MB uh 100 MB failed uh there
- 00:06:38was postgress 8 uh improved on our
- 00:06:41schemas improved our schema design
- 00:06:43improved the way an app has to built on
- 00:06:46has to be built on top of um our
- 00:06:49databases not rewrote our apps multiple
- 00:06:51times uh again if you have read any of
- 00:06:54our posts you would know that we we
- 00:06:56rewrite a lot of our things multiple
- 00:06:59times over over and over again um it is
- 00:07:02mundane might be but it solves it solves
- 00:07:05a lot of headache for us by removing uh
- 00:07:08Legacy code Legacy issues and I would
- 00:07:11say Legacy schemas too because you might
- 00:07:13have started with a schema that doesn't
- 00:07:16make sense right now uh because your
- 00:07:18queries have changed the way you deal
- 00:07:19with the data has changed so we end up
- 00:07:23rewriting everything we know that
- 00:07:24nothing is constant Everything Will
- 00:07:26Change needs to change everything will
- 00:07:27break and that's okay we are okay with
- 00:07:29it uh we currently deal with hundreds of
- 00:07:32GBS of import every single day um uh
- 00:07:36absolutely no issues at all I mean there
- 00:07:38are plenty of issues but postgress has
- 00:07:40worked fine for us till now though we
- 00:07:44have other plans of doing other things
- 00:07:45with it but till now again nothing as
- 00:07:49resilient as good as postgress has been
- 00:07:52for us so how do we
- 00:07:56manage uh this big amount of data I've
- 00:08:00put a question mark there
- 00:08:01because when we when we started out um
- 00:08:06understanding our data better I remember
- 00:08:08this was six years back probably I
- 00:08:11remember sitting with Nithin our CEO and
- 00:08:13even Kash and Nithin used to be like so so
- 00:08:16we are very close to Big Data right
- 00:08:18because big data used to be this fancy
- 00:08:21term at that point of time I never
- 00:08:23understood what Big Data meant uh I
- 00:08:26assumed that it's just a nice looking
- 00:08:28term on your assume right you you're
- 00:08:30you're managing Big Data um eventually
- 00:08:33we uh eventually I guess we all realize
- 00:08:37that all of that is pretty much hogwash
- 00:08:39uh there are companies which need big
- 00:08:42data there are companies which don't
- 00:08:43need big data you don't have to be a
- 00:08:46serious engineering company if you I
- 00:08:48mean if you don't need to have big data
- 00:08:50to be a serious engineering company you
- 00:08:52can make do with little less data so um
- 00:08:56I'm going to be this talk is probably
- 00:08:58going to be a bit of an over overview of
- 00:09:00how we manage our data till now but um I
- 00:09:03glad to I'll be more than glad to take
- 00:09:05questions at the end of it if there are
- 00:09:07more doubts or anything else uh first
- 00:09:10thing is uh index uh but don't overdo it
- 00:09:14so when we started
- 00:09:16out I I thought that indexing was like a
- 00:09:19foolproof plan to solve everything that
- 00:09:22is there realized it much later that
- 00:09:24indexing itself takes a lot of space
- 00:09:27indexing in itself uh uh you can't index
- 00:09:31for every query that you write you need
- 00:09:33to First understand that there are some
- 00:09:35queries that need to be fast and some
- 00:09:36queries that you can afford it to be
- 00:09:38slow and that's okay so how we have
- 00:09:42designed our systems is the queries that
- 00:09:44um
- 00:09:46are the the the number of queries are
- 00:09:49higher for let's say a particular set of
- 00:09:50columns those columns are indexed and uh
- 00:09:54the columns that are not indexed they
- 00:09:56might be queried and but we don't index
- 00:09:58them at all and that's okay those
- 00:10:00queries might take a long enough long
- 00:10:02time but they're not user facing they
- 00:10:05are backend reports that it generated
- 00:10:07over time not everything has to happen
- 00:10:09in 1 second or half a millisecond or
- 00:10:11stuff like that so we're very aware of
- 00:10:12that when we index we use partial
- 00:10:14indexes everywhere U that's another
- 00:10:16thing that we learned that uh even if
- 00:10:18you're indexing a column you can partial
- 00:10:21indexing will be much more helpful for
- 00:10:23you in categorizing the kind of data
- 00:10:25that you want to search um the second
- 00:10:28thing is materialized views um I'll
- 00:10:31combine materialized views and the
- 00:10:33denormalization point into one uh the
- 00:10:35reason being uh if if any of you have
- 00:10:38done engineering here you would you
- 00:10:40would have studied database systems and
- 00:10:41one of the first things that that is
- 00:10:43taught to us is normalize normalize
- 00:10:45normalize everything right and when we
- 00:10:47come out we we come out with this with
- 00:10:49this idea that we need to
- 00:10:51normalize uh all of our data sets you'll
- 00:10:54realize that this works well on smaller
- 00:10:58data
- 00:10:59as the data grows those join queries
- 00:11:02will stop working those join queries
- 00:11:04will become so slow that there is
- 00:11:06absolutely nothing you can do to fix it
- 00:11:09so we took a conscious decision to
- 00:11:12denormalize a lot of our data sets so
- 00:11:15majority of our data sets majority of
- 00:11:17our tables have nothing to do with each
- 00:11:19other and we are okay with that it
- 00:11:21obviously leads to
- 00:11:23increase in the size of data that we
- 00:11:25store but the the trade-off that we get
- 00:11:29in improvement improvement of query is
- 00:11:31much higher than the size increase we
- 00:11:35can always Shard and make our database
- 00:11:37smaller or delete data or do whatever
- 00:11:39but query Improvement is a very
- 00:11:41difficult task to pull off uh if you if
- 00:11:44your entire query is a bunch of nested
- 00:11:46joins across uh two heavy tables we
- 00:11:50avoid that everywhere and one of the
- 00:11:52ways we avoid it is obviously as I said
- 00:11:53we denormalize a lot and we uh have
- 00:11:58materialized views
- 00:11:59everywhere in our system uh and that is
- 00:12:03one of the easiest cleanest fastest way
- 00:12:06to make your queries work faster if
- 00:12:09there is a bunch of small data set that
- 00:12:12is getting reused all over your
- 00:12:13postgress query multiple times over use
- 00:12:16WITH statements use materialized views
- 00:12:18and it will be uh your queries will
- 00:12:21automatically be fast I don't want to
- 00:12:23give you statistics about 10x fast or
- 00:12:2520x fast and all because it again
- 00:12:27depends upon data your query your server
- 00:12:29size all of those things so no no
- 00:12:32metrics as such being thrown here but it
- 00:12:35will have a much better experience than
- 00:12:37doing multiple joins across massive
- 00:12:39tables avoid that at all costs.
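A minimal sketch of the WITH-statement and materialized-view pattern the speaker describes; the report and table below are invented for illustration, not the actual Console schema:

```sql
-- Precompute a small, heavily reused aggregate instead of re-joining big tables.
CREATE MATERIALIZED VIEW daily_pnl AS
SELECT user_id,
       trade_date,
       SUM(sell_value - buy_value) AS realized_pnl
FROM   tradebook                     -- hypothetical denormalized table
GROUP  BY user_id, trade_date;

-- Refresh once after the nightly import, then serve reads from it all day.
REFRESH MATERIALIZED VIEW daily_pnl;

-- Inside a bigger report, reuse a small intermediate result with WITH
-- instead of repeating the same subquery several times.
WITH recent AS (
    SELECT * FROM daily_pnl WHERE trade_date >= DATE '2024-01-01'
)
SELECT user_id, SUM(realized_pnl) AS pnl_2024
FROM   recent
GROUP  BY user_id;
```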
- 00:12:43Um one more thing is understanding your data
- 00:12:45better and by that I
- 00:12:48mean I feel like uh and this is
- 00:12:51something that I've learned after
- 00:12:52talking to a lot of people uh of
- 00:12:55different companies or uh different
- 00:12:57startups and how they work
- 00:12:59and they pick the database first and
- 00:13:02then they figure out how to put the data
- 00:13:03into the database I don't know why they
- 00:13:05do that maybe the stack looks more uh
- 00:13:08Rockstar like I guess uh if you choose
- 00:13:10some fancy database and then try to pigeon-
- 00:13:12hole the data into it uh picking
- 00:13:15first understanding the data then
- 00:13:18understanding how you will query the
- 00:13:19data should be the first step before you
- 00:13:22pick what kind of database and how you
- 00:13:25will uh design the schema of the
- 00:13:27database if you don't do that if if you
- 00:13:29say that you know what it's it's a
- 00:13:30postgress conference it's going to be
- 00:13:33just postgress in my stack there will be
- 00:13:34nothing else nowhere uh postgress is
- 00:13:38like the one true solution for
- 00:13:40everything so that's that's not going to
- 00:13:42work um then the next point is post is
- 00:13:46Db tuning around queries uh one more
- 00:13:50thing we have uh realized is many people
- 00:13:53tune the database this something that I
- 00:13:55came across again very recently while I
- 00:13:57was dealing with another company uh
- 00:13:59database stack they have tuned their
- 00:14:02database in in a wholesome manner that
- 00:14:04means that the entire database has a set
- 00:14:07of parameters that they have done PG
- 00:14:08tuning for uh and it caters to every
- 00:14:12single table that is there in database
- 00:14:13and that is a terrible approach if you
- 00:14:16have a lot of data a better way to do is
- 00:14:19you tune your D there's no denying that
- 00:14:22but you also tune your tables maybe a
- 00:14:24particular table needs more parallel
- 00:14:26workers maybe a particular table needs
- 00:14:29frequently vacuumed compared to the
- 00:14:31other set of tables that you have in
- 00:14:33your DB so um you need to you need to
- 00:14:37tune based upon the queries that hit
- 00:14:39those particular tables rather than the
- 00:14:40entire database in
- 00:14:42itself.
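The per-table tuning mentioned here maps onto Postgres storage parameters; a hedged sketch with placeholder table names and values, not Zerodha's actual settings:

```sql
-- Tune the one import-heavy table instead of the whole cluster around it.
ALTER TABLE tradebook SET (
    parallel_workers               = 8,    -- bigger scans on this table get more workers
    autovacuum_vacuum_scale_factor = 0.01  -- and it gets vacuumed far more eagerly
);
-- Small, rarely written lookup tables simply keep the cluster defaults.
```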
- 00:14:46Um the last point is understanding a query planner I'm sure there is uh
- 00:14:49there's a mistake understanding a query
- 00:14:50planner so uh another mistake when I
- 00:14:54started out was I'm sure I don't know
- 00:14:57how many of you feel that way with a
- 00:14:59query planner of postgress or any
- 00:15:01database is a little hard to understand
- 00:15:04um and I felt that for the longest time
- 00:15:06I would it will just print a bunch of
- 00:15:08things and all I will read is the the
- 00:15:10last set of things right so it took this
- 00:15:13much time it accessed this much data and
- 00:15:16that's all I understood from those query
- 00:15:18planners took me a very long time to
- 00:15:21understand the direction of the query
- 00:15:23which is very very important to
- 00:15:25understand uh direction of the query
- 00:15:26would be what is called first a where
- 00:15:28clause and and Clause a join clause in
- 00:15:30your entire query if you do not
- 00:15:32understand that you will not be able to
- 00:15:33understand your query plan at all and
- 00:15:36it's very easy to understand a query
- 00:15:37plan of a simple query right if you do a
- 00:15:39select star from whatever table and
- 00:15:40fetch that you don't even need a query
- 00:15:42plan for that if the database is if the
- 00:15:45if there's if the index is not there
- 00:15:47that query will be slow you don't need a
- 00:15:48query plan to tell you that but query
- 00:15:51plan is super helpful when you're doing
- 00:15:53joins across multiple tables and uh
- 00:15:57understanding what kind of
- 00:15:59uh sorts are being called is very very
- 00:16:02important to understand I think me and
- 00:16:04Kash must have sat and debugged multiple
- 00:16:06queries trying to understand the query
- 00:16:08planner of it all and pogus is very
- 00:16:10funny with its query planning so uh
- 00:16:14there will be a certain clause in which
- 00:16:17a completely different query plan will
- 00:16:18be chosen for no reason at all and you
- 00:16:21have to and there have been reasons
- 00:16:22where we don't I still don't understand
- 00:16:24some of the query plans that are there
- 00:16:25but we have backtracked like into a into
- 00:16:28a explanation for ourselves that if we
- 00:16:31do this this this this then our query
- 00:16:33plans will look like this and if we do
- 00:16:35these set of things our query plans will
- 00:16:36look like that this is better than this
- 00:16:38we'll stick to this and we have followed
- 00:16:40that everywhere
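What the speaker calls the direction of the query, which node runs first, where the sort happens, whether an index is actually used, is exactly what EXPLAIN shows; a toy example against an invented table:

```sql
-- Ask the planner what it will actually do, with real row counts and timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, SUM(quantity * price) AS turnover
FROM   tradebook
WHERE  trade_date >= DATE '2024-01-01'   -- does this hit a partial index or a seq scan?
GROUP  BY user_id
ORDER  BY turnover DESC
LIMIT  20;
```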
- 00:16:42and I don't think I don't think you can
- 00:16:45look at a documentation and understand a
- 00:16:47query plan either this is something that
- 00:16:49you have to play around with your
- 00:16:50queries play around with your data to
- 00:16:52get to the point um the queries that I
- 00:16:55would have in my system on my set of
- 00:16:58data would have a you reduce a you
- 00:17:01reduce the data by half and the query
- 00:17:03plan will work very differently just the
- 00:17:05way postgress is and that is something
- 00:17:09that you have to respect you have to
- 00:17:11understand and if you don't understand
- 00:17:12query plan uh forget about optimizing
- 00:17:15your queries DB schema nothing nothing
- 00:17:17will ever happen you will just keep
- 00:17:18vacuuming which which brings me back to
- 00:17:20the last point and this is this is funny
- 00:17:24because I was in the vacuuming talk the
- 00:17:27one that happened right before for uh uh
- 00:17:30right before lunch break so the first
- 00:17:33thing he said was do not turn off autov
- 00:17:35vacuum the first thing I would say is
- 00:17:37turn off autov vacuum so uh and I'll
- 00:17:40tell you why we do that and why it works
- 00:17:43in our context and might not work for
- 00:17:45someone else autov vacuum is an
- 00:17:47incredible feature if tuned properly if
- 00:17:50you have seen the tuning parameters
- 00:17:52they're not very easy to understand what
- 00:17:54does delete tuples after X number of
- 00:17:57things even mean there
- 00:17:59they're not easy to what does nap time
- 00:18:02mean how does someone who has not dealt
- 00:18:05with database for a very long time
- 00:18:07understand the set of parameters there
- 00:18:08that is documentation and all of that
- 00:18:10but it's really hard to read an abstract
- 00:18:13documentation and relate it to a schema
- 00:18:17um we we played around with every single
- 00:18:20parameter that autov vacuum has nothing
- 00:18:22worked for us and I'll tell you why we
- 00:18:25would bulk we would bulk import billions
- 00:18:28of row in a fixed set of time now you
- 00:18:31might say that well if you are importing
- 00:18:33everything in a fixed set of time why
- 00:18:35don't you trigger why don't you write
- 00:18:37your autov vacuum to work right after
- 00:18:40the UT has been
- 00:18:41done that UT is never under our control
- 00:18:45the files can come delayed from anywhere
- 00:18:47any point point of time and because none
- 00:18:51of it is under our control we decided
- 00:18:54that autov vacuum is not a solution for
- 00:18:56us turned it off because it was it was
- 00:18:59going to run forever and ever and ever
- 00:19:02we vacuum uh so I hope most of you know
- 00:19:05the difference between vacuum full and
- 00:19:06vacuum analyze but if you don't know
- 00:19:07vacuum full a very simple explanation
- 00:19:10vacuum full will give you back your
- 00:19:11space that you have updated deleted
- 00:19:13vacuum analyze will improve your query
- 00:19:14plan we don't vacuum full anything
- 00:19:16because that completely blocks the DB we
- 00:19:18vacuum analyze all our queries right
- 00:19:20after doing a massive bulk UT uh we we
- 00:19:24realize that um I'm sure if you have
- 00:19:27been in the talk he spoke about Max
- 00:19:28parallel workers while autov vacuuming
- 00:19:30we understand that autov vacuuming uses
- 00:19:32the parallelism of postgress that is
- 00:19:34inbuilt into it which we don't but we
- 00:19:37don't really care about it because this
- 00:19:39happens late in the night
- 00:19:41and vacuuming taking half an hour more
- 00:19:44or 10 minutes more doesn't make a big
- 00:19:46difference for us at that point of time
- 00:19:48so in this context in this scenario
- 00:19:50turning off autov vacuum and running
- 00:19:52vacuum on our own as a script that
- 00:19:55triggers vacuum for multiple tables one
- 00:19:57after the other once are Imports are
- 00:19:59done works for us but to uh to reiterate
- 00:20:03again it might not work for your context
- 00:20:05and maybe autov vacuum is the better
- 00:20:07solution but remember that autov vacuum
- 00:20:09has a lot of pitfalls and I will I mean
- 00:20:13I read postgress 13 documentation a
- 00:20:16while back it still hasn't improved to
- 00:20:18an extent that I thought it should have
- 00:20:20by now and it still has its set of
- 00:20:23issues while dealing with massive sets
- 00:20:24of data um but I hope I hope it gets
- 00:20:27better over time and uh if if some if
- 00:20:30some code developers can do it then it
- 00:20:32has to be postest so I hope they do that
- 00:20:34so
- 00:20:37yeah um okay so this is another
- 00:20:40interesting part of the the talk I guess
- 00:20:44but before I get there um I remember
- 00:20:47speaking to someone outside and they
- 00:20:49said that how is your setup like and uh
- 00:20:52what do you do for um
- 00:20:55replica and Master Slave and all of of
- 00:20:58those set of things
- 00:20:59so I guess this will be triggering for
- 00:21:02everyone we don't have a Master Slave at
- 00:21:04all we don't have a replica either uh we
- 00:21:08have one database and one one node we
- 00:21:12have sharded it using foreign data
- 00:21:14wrapper uh why we have sharded it like that
- 00:21:17I will explain I'll get to that um but
- 00:21:20we have sharded it using foreign data
- 00:21:21wrapper so we have divided our data
- 00:21:23across multiple financial years and kept
- 00:21:27older historical financial years in a
- 00:21:29different uh database server and
- 00:21:32connected both of them using FDW and we
- 00:21:36query the primary DB and it figures out
- 00:21:38from the partitioning that the other the
- 00:21:42data is not in this database server
- 00:21:44right now and it is in the other
- 00:21:45database it figures it out fetches the
- 00:21:47query for us fetches the data for us um
- 00:21:51no slave setup uh our backups are
- 00:21:53archived in S3 we are okay this is by
- 00:21:57the way to reiterate this is uh a back
- 00:22:00office platform we do not promise that
- 00:22:03we'll have 100% off time we are okay
- 00:22:05with that we understand that if
- 00:22:08postgress goes down which has never ever
- 00:22:10happened again thankfully to postgress
- 00:22:12but even if it goes down for whatever
- 00:22:14number of reasons we have been able to
- 00:22:16bring that back up bring the database
- 00:22:18back up by using S3 within minutes and I
- 00:22:22have restarted postgress that is
- 00:22:24pointing towards a completely fresh
- 00:22:26backup from S3 with maybe 15 20
- 00:22:29terabytes of data under under a minute
- 00:22:32or two so it works so there is there
- 00:22:35there might be fancy complicated
- 00:22:37interesting setup to make your replicas
- 00:22:40work but this also works and I many
- 00:22:43people might call it jugaad uh hacky way
- 00:22:46of doing things but I don't think it is
- 00:22:48I think it's a sensible approach we
- 00:22:50don't want to over engineer anything at
- 00:22:52all if this works why have a bunch of
- 00:22:55systems that you need to understand just
- 00:22:57to manage manage um a replica setup now
- 00:23:02coming back uh to the question of if we
- 00:23:05don't have a replica how do we load
- 00:23:07balance we don't but what we have done
- 00:23:11differently is that we have a we have a
- 00:23:14second postgress server that sits on top
- 00:23:17of our primary DB and acts like a
- 00:23:19caching layer we have uh we have an
- 00:23:23open-source uh piece of software called
- 00:23:25SQL Jobber which is an async uh job based
- 00:23:30mechanism that keeps pulling the DB and
- 00:23:33then fetches the data stores it in
- 00:23:34another postgress instance um and then
- 00:23:37eventually our app understands that the
- 00:23:41fetch is done data is ready to be served
- 00:23:43and it fetches the DB fetches the data
- 00:23:46from the caching layer so we end up
- 00:23:49creating
- 00:23:51around 500 GB worth of I would say
- 00:23:54around 20 30 millions of tables per day
- 00:23:58uh I remember speaking I remember asking
- 00:24:00someone postgress slack a long time back
- 00:24:03that we are doing this thing where we
- 00:24:05creating 20 million tables a day and
- 00:24:07they like why are you doing this isn't
- 00:24:09there another way of doing it and we're
- 00:24:11like no this works and the reason why we
- 00:24:13do this is because uh postgress in in
- 00:24:16itself supports sorting uh which red
- 00:24:18this doesn't postgress I mean at that
- 00:24:20point of time uh it it lets us do
- 00:24:23pagination it lets us do search on top
- 00:24:26of trading symbols if we need I mean
- 00:24:28search on top of any columns that we
- 00:24:29need to do if necessary so we have
- 00:24:32postgress setting as a caching layer on
- 00:24:34top of our primary postgress and all the
- 00:24:38queries first come to the SQL jobber
- 00:24:40application they go to our primary DB
- 00:24:44nothing gets hammered to the primary DB
- 00:24:45though so the primary DB is not under
- 00:24:47any load at any point of time I mean
- 00:24:49there is a query load but it's not
- 00:24:51getting hammered at all the hammering
- 00:24:53happens to this caching DB which gets
- 00:24:55set eventually at some point of time
- 00:24:57with the data
- 00:24:59and then we serve the data to to our end
- 00:25:02users and that remains for the entire
- 00:25:04day because as I said during the day the
- 00:25:06data doesn't change a lot so we can
- 00:25:08afford to cach this data for the entire
- 00:25:10time duration there are some instances
- 00:25:12in which we need to clear our C clear
- 00:25:15the cache we can just delete the key and
- 00:25:17then this entire process happens all
- 00:25:18over again for that particular user so
- 00:25:21that's our that's how our Postgres
- 00:25:23caching layer works it has worked fine
- 00:25:25for us every night we clean the 500 GB so
- 00:25:28how we do is every night uh we have two
- 00:25:31500 GB discs uh pointing at to the
- 00:25:34server we switch from disk one to dis
- 00:25:36two then the disk one gets cleaned up
- 00:25:38then the next day goes to dis goes back
- 00:25:40from disk two to disk one and again the
- 00:25:42new tables are set all over it again
- 00:25:44works fine uh never been an issue with
- 00:25:47this um coming back to our learnings
- 00:25:50with postgress yeah sorry
- 00:26:05can hello are you able to hear me yeah
- 00:26:09yeah uh you know you're telling that
- 00:26:10about kite platform so uh from the kite
- 00:26:14platform data that is LP enement right
- 00:26:16so from the kite platform data you are
- 00:26:19porting to the console database yeah so
- 00:26:21that is a nightly job yeah that's a
- 00:26:23nightly job that's a nightly job yeah so
- 00:26:25that's what you telling in the console
- 00:26:28uh uh system that create millions of
- 00:26:30data right yeah so um okay maybe I
- 00:26:33should explain this again so uh you
- 00:26:35place your orders your trades and
- 00:26:37everything on kite right at the night we
- 00:26:40get a order book or a trade book that
- 00:26:42gets imported into console we do that to
- 00:26:46compute the buy average which is the
- 00:26:48average price at which you bought that
- 00:26:50stock or the profit and loss statement
- 00:26:53which you'll use for taxation for any
- 00:26:54other reason that you might need it for
- 00:26:56that is why we import data into console
- 00:26:59so to fetch these set of statements you
- 00:27:01have to come to console to fetch that
- 00:27:03now when you are fetching the statement
- 00:27:05we this caching layer sits on top of
- 00:27:07that your fetches go to this caching
- 00:27:10layer first it checks if there is
- 00:27:11already a prefetched cache for you ready
- 00:27:13or not if not the query goes through the
- 00:27:16DB the the data is fetched put into the
- 00:27:18caching layer and for the entire day the
- 00:27:20caching layer is serving the data to you
- 00:27:22not the primary DB the primary DB
- 00:27:24remains free at least for you as the
- 00:27:26user so let's say you come in you lay
- 00:27:28around in console you load a bunch of
- 00:27:29reports everything is cached for the
- 00:27:31entire day in this caching layer so
- 00:27:33primary DB remains as it is till the
- 00:27:36till that night so that night we would
- 00:27:38have gotten all the trades orders any
- 00:27:40things that you have done on your
- 00:27:41trading platform into uh console we
- 00:27:44import all of that we clear our cache
- 00:27:46because it's a fresh set of data your
- 00:27:48pnl your Ledger your financial
- 00:27:50statements have changed because maybe
- 00:27:51you have traded that day maybe have
- 00:27:53bought stocks that day or anything would
- 00:27:54have done happened to your account that
- 00:27:55day so we clear our cache then the next day
- 00:27:59when you come and fetch the data again
- 00:28:01all of this is set all over again and
- 00:28:03then whenever you can keep revisiting
- 00:28:04console keep fetching whatever amounts
- 00:28:06of data you want to it will come from
- 00:28:08this cache unless of course you change
- 00:28:10the the date parameters only then we uh
- 00:28:15uh go and fet the data from our primary
- 00:28:17DP but we have realized that most users
- 00:28:20use the data of a particular time frame
- 00:28:23and they don't they don't want to come
- 00:28:25and check for last 3 years what has
- 00:28:27happened it is always last 6 months last
- 00:28:282 months last one month and they check
- 00:28:31that once they go back and uh we cannot
- 00:28:36obviously build a system where every
- 00:28:38single date range has to be uh equally
- 00:28:41scalable and equally available uh we are
- 00:28:44very aware that the older which I'll
- 00:28:46talk about how we have shed we are very
- 00:28:48aware that our older Financial year data
- 00:28:50points don't need to be available all
- 00:28:52the time at the highest possible metrics
- 00:28:54of a server uh they don't have to be at
- 00:28:57a don't have to be served at a very fast
- 00:28:59rate either right so these are the
- 00:29:01decisions that we have taken and it has
- 00:29:03worked fine for us uh might not work for
- 00:29:05another person but yeah so I I hope that
- 00:29:09answered your question one doubt on that
- 00:29:11kite is also having postgress DB right
- 00:29:13so are using PG dump or postgress
- 00:29:16utilities itself uh no so kite uh uses
- 00:29:19postgress for its market watch uh so
- 00:29:22market watch is the place where you add
- 00:29:24different scripts or different stocks
- 00:29:27and it tells you the current price of
- 00:29:28the stock uh though we have we have
- 00:29:31plans of moving away from that to S DB
- 00:29:34um that has got nothing to do with this
- 00:29:37uh how I mean I guess you're asking a
- 00:29:39more of a how a broker Works question or
- 00:29:41how a trading platform works but you
- 00:29:43place an order the order comes as an
- 00:29:46exchange file at the end of the day for
- 00:29:48us and we import that so there is no PG
- 00:29:50dump that happens from kite to console
- 00:29:53so those are completely two different
- 00:29:54silos that have very little to do with
- 00:29:56each other they rarely share data among
- 00:29:58each other and they're never in the hot
- 00:30:00path because we understand that kite is
- 00:30:02a uh extremely fast read and write platform
- 00:30:05where everything has to happen over
- 00:30:06milliseconds and it can't ever go down
- 00:30:08so these set of fundamentals will not
- 00:30:10really work there so there is no
- 00:30:12connection of PG dumping kite data into
- 00:30:14console kite Works throughout the day
- 00:30:16console Works after the day after your
- 00:30:19trading Market is done that's where you
- 00:30:20come to console and check how well you
- 00:30:22have performed in the day so I that's
- 00:30:26that one more just caching layer do you
- 00:30:28have in kite also caching layer on kite
- 00:30:31yeah it's redis it's predominantly redis
- 00:30:34caching layer so we also use caching uh
- 00:30:37redis caching on everywhere actually
- 00:30:39it's not just kite we pretty much use
- 00:30:41redis like uh if you have used our
- 00:30:44platform coin uh it used to set on a $5
- 00:30:47digital ocean droplet for the longest
- 00:30:49possible time because everything was
- 00:30:50cached on a redis uh instance and used
- 00:30:53to work just fine so we use redis
- 00:30:56predominantly to cache we don't use redis
- 00:30:58in console for these kind of caching
- 00:31:00layer because sorting and pagination is
- 00:31:02not supported on it uh this is a very
- 00:31:05specific requirement here it works here
- 00:31:07so we use postgress here for
- 00:31:09that is it fine yeah that's I use this
- 00:31:12Kite and Console that's why I asked this
- 00:31:13cool no issues um thank
- 00:31:17you so our learnings with postgress and
- 00:31:21um I'll start off
- 00:31:23with how we because I I remember my my
- 00:31:27summary of my talk uh that is there on
- 00:31:29the posters and Etc outside talks about
- 00:31:32how we Shard and why we Shard the way we
- 00:31:34do it um if you have seen the Citus DB
- 00:31:38extension or a lot of sharding examples
- 00:31:40all over the world of all the DBS in the
- 00:31:43world how they set it up is have
- 00:31:47a have a have a master DB have a parent
- 00:31:51DB or whatever and have tenets to every
- 00:31:54single child that is connected to it now
- 00:31:58how those tenets uh work is that you
- 00:32:00query the master DB it figures out that
- 00:32:02these set of tenets are in this uh child
- 00:32:05setup or the sharded setup and the query
- 00:32:08goes there we believe that there is no
- 00:32:13reason to add another column that has
- 00:32:15these IDs on it we actually in most of
- 00:32:18our tables we have deleted all our IDs U
- 00:32:20extra data don't need it so we follow
- 00:32:23that in a lot of places so um what we
- 00:32:27did decided was was that we partition
- 00:32:29our database per month because it works
- 00:32:32for
- 00:32:32us then for every single Financial year
- 00:32:36we put it in a different database uh
- 00:32:38server and we connect it via FDW (foreign data wrapper)
- 00:32:42and that is our entire sharded
- 00:32:45setup uh it has worked fine for us
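A sketch of the month-wise partitioning plus per-financial-year split being described; the table definition and names are purely illustrative:

```sql
-- Parent table partitioned by trade date; each month is its own partition.
CREATE TABLE tradebook (
    user_id       bigint  NOT NULL,
    trade_date    date    NOT NULL,
    tradingsymbol text    NOT NULL,
    quantity      numeric NOT NULL,
    price         numeric NOT NULL
) PARTITION BY RANGE (trade_date);

CREATE TABLE tradebook_2024_01 PARTITION OF tradebook
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE tradebook_2024_02 PARTITION OF tradebook
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Partitions for older financial years can themselves be foreign tables
-- attached through postgres_fdw, living on a separate archive server.
```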
- 00:32:50but I would say that at our scale um and our scale
- 00:32:55is 30 40 terabytes or 50 terabytes you can
- 00:32:58say right now
- 00:33:01um it it's starting to falter a bit it's
- 00:33:04not it's not a great experience anymore
- 00:33:07and which is why we are moving to a very
- 00:33:09different setup different way of
- 00:33:11sharding maybe that is for another talk
- 00:33:13but till now we could scale uh to
- 00:33:16millions of users serving billions of
- 00:33:19requests uh 500 600 GBS of data per day
- 00:33:23using just foreign data wrapper and a SQL
- 00:33:25jobber caching layer on top of our
- 00:33:27primary DB no nodes no uh load balancer
- 00:33:31nothing at all um so our learnings of
- 00:33:36postgress um has been that this is
- 00:33:39something that is a there is a gut
- 00:33:42feeling when you write your queries or
- 00:33:45when you write when you look at a
- 00:33:46database schema that uh our gut feeling
- 00:33:49is
- 00:33:50that every query has a time to it like
- 00:33:55for a particular amount of data for a
- 00:33:57particular query should not take more
- 00:33:58than x number of milliseconds I guess
- 00:33:59that comes with experience many of you
- 00:34:01can just look at the data look at the
- 00:34:03query and know that something is wrong
- 00:34:05if even if it's slow by a few
- 00:34:06milliseconds you can figure that out so
- 00:34:08we have a hard limit that certain
- 00:34:10queries cannot cross this limit and we
- 00:34:12optimize and keep on optimizing based on
- 00:34:15that um most of our heavy queries are in
- 00:34:17an async setup like the job or cash you
- 00:34:20said we ensure that none of it is on the
- 00:34:23hot path of an app um there is no glory
- 00:34:27in storing to too much data so we we
- 00:34:30delete a lot of data uh so someone was
- 00:34:32surprised that our total database is 50
- 00:34:35terabytes or um yeah probably around 50
- 00:34:39or 60 not more than that for sure um and
- 00:34:42one of the reasons why it is 50 and not
- 00:34:44500 terabytes is we delete a lot of data
- 00:34:47we do not believe in storing data that
- 00:34:50we do not need what what does it mean is
- 00:34:53that we uh for most of the computations
- 00:34:56that we do for most of the Imports and
- 00:34:58inserts and everything that we do we
- 00:35:00have a hot backup or whatever you can
- 00:35:03call it of the last 15 days or last 15
- 00:35:06days after that we have checkpoint
- 00:35:08backups of last one month last two
- 00:35:10months last 3 months one backup for each
- 00:35:12month we do not have any backup in
- 00:35:15between any of those dates because we
- 00:35:17can go back to any single month and
- 00:35:19regenerate everyday's data till now we
- 00:35:22are okay doing that because we have that
- 00:35:25a night where uh anything can go wrong
- 00:35:28and we can run these set of computations
- 00:35:30and come back to the current state that
- 00:35:32is right now maybe it doesn't work for
- 00:35:35others but I again this is another
- 00:35:38experience that I've learned looking at
- 00:35:39databases of others that there is a lot
- 00:35:41of frivolous data that people like to
- 00:35:42keep for no reason at all because it
- 00:35:44just makes the database looks bigger and
- 00:35:46I don't know makes it looks fancier just
- 00:35:48delete it it doesn't it's back it up in
- 00:35:51a S3 and put it somewhere like don't
- 00:35:53have to be in a database why does
- 00:35:54six-year-old data unless it's a
- 00:35:56compliance that is being set by the the
- 00:35:59company you work for or the organization
- 00:36:01you work for unless it's a compliance
- 00:36:02that you have to do it it can be an S3
- 00:36:05backup it can be a file um doesn't have
- 00:36:08to be in a database and you don't have
- 00:36:09to be responsible for every query of
- 00:36:12last 10 years to be served under 1
- 00:36:15millisecond doesn't make sense it will
- 00:36:17never scale don't do that.
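With partitioning like the sketch above, the "just delete it" retention policy becomes cheap; a hedged illustration (the partition, database and bucket names are made up):

```sql
-- Detach an old month so it can be dumped to S3 and then dropped,
-- instead of serving six-year-old rows from the live database.
ALTER TABLE tradebook DETACH PARTITION tradebook_2018_01;
-- e.g. pg_dump -t tradebook_2018_01 consoledb | aws s3 cp - s3://archive/tradebook_2018_01.sql
DROP TABLE tradebook_2018_01;
```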
- 00:36:22Um the other thing that I've also noticed is a lot of
- 00:36:24people write maybe this is a front end
- 00:36:27develop are going into backend issue uh
- 00:36:29where a lot of the logic that should
- 00:36:32have been done by postgress gets done by
- 00:36:34the app and I've noticed that in a lot
- 00:36:37of places and I think that is uh
- 00:36:39something that fundamentally should
- 00:36:41change Postgres in itself can do a lot
- 00:36:44of computations like summing average
- 00:36:47window functions you can do so many
- 00:36:48things by overloading into Postgres rather
- 00:36:51than your app doing it um and I find
- 00:36:54that strange because your app should be
- 00:36:56responsible for just loading the queries
- 00:36:59fetching the data it should not be
- 00:37:01Computing for most of the scenarios I
- 00:37:03think I mean I don't know why this this
- 00:37:06this is something that we had done as a
- 00:37:08mistake too and we learned and I hope
- 00:37:11that uh maybe if there is only one
- 00:37:14learning from my entire talk uh because
- 00:37:16I've noticed this in a lot of places uh
- 00:37:20is to overload your postgress with most
- 00:37:22of the computations it can do it faster
- 00:37:24than any app that you write unless I
- 00:37:26don't know you using r or something else
- 00:37:28but still Postgres will be really fast so
- 00:37:30try to do that.
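An example of the kind of computation that can be pushed down into Postgres instead of the app; the running-P&L query below is an illustration with made-up columns:

```sql
-- Let the database do the aggregation and window functions;
-- the app only fetches the finished result.
SELECT trade_date,
       SUM(sell_value - buy_value)                        AS day_pnl,
       SUM(SUM(sell_value - buy_value))
           OVER (ORDER BY trade_date)                     AS running_pnl,
       AVG(SUM(sell_value - buy_value))
           OVER (ORDER BY trade_date ROWS 6 PRECEDING)    AS pnl_7day_avg
FROM   tradebook
WHERE  user_id = 1234
GROUP  BY trade_date
ORDER  BY trade_date;
```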
- 00:37:35And um yeah uh as you would have noticed that our engineering
- 00:37:37setup is very lean we are it's not
- 00:37:41overwhelming or underwhelming it's stay
- 00:37:43whelmed I guess uh we we don't overdo
- 00:37:46anything at all we we
- 00:37:49always uh we always hit the limits of
- 00:37:52what we have right now in every possible
- 00:37:54way and only then look out for other
- 00:37:57Solutions
- 00:37:58and it has worked pretty good for us we
- 00:38:00have never over engineered any of our
- 00:38:03Solutions till now and we have always
- 00:38:05organically found solutions for whenever
- 00:38:07we have come across issues if postgress
- 00:38:09hasn't worked for us then that's fine
- 00:38:12we'll find another solution for it so as
- 00:38:15I said sometimes postgress is might not
- 00:38:17be the answer sometimes a different
- 00:38:18database would be the answer for you
- 00:38:20and you should be I guess humble enough
- 00:38:22to accept that and move on from
- 00:38:24postgress most databases are very
- 00:38:26similar to each other if you go through
- 00:38:28there how they design the data how the
- 00:38:30schemas are made unless you're dealing
- 00:38:32with columnar databases they're very
- 00:38:34similar and this the the uh the
- 00:38:37fundamentals remain the same across all
- 00:38:40databases if they are not then that is a
- 00:38:42wrong database so even if so your route
- 00:38:45is experimenting with ClickHouse a lot
- 00:38:47and the fundamentals are very similar
- 00:38:49so do not be afraid to experiment with a
- 00:38:53different set of databases we all do
- 00:38:54that a lot in our free time uh because
- 00:38:57because I mean it's a strange way to I
- 00:38:59guess end the talk but Postgres might not
- 00:39:01be an answer for every single problem
- 00:39:03though we found an answer for a lot of
- 00:39:04our problems and you should be okay with
- 00:39:07that uh thank
- 00:39:09[Applause]
- 00:39:14you hello um so even if the application
- 00:39:18users are you can have R inside Postgres so
- 00:39:20that that that solves the problem anyway
- 00:39:22but my question is uh when you say the
- 00:39:24caching layer has 20 million tables um
- 00:39:27do how do you take care of the catalog
- 00:39:29bloat or do you just drop and recreate
- 00:39:31the whole cluster every night we just
- 00:39:34rm -rf the entire data okay fantastic yeah
- 00:39:36that's what I was thinking the other
- 00:39:37problem is uh even with that um I've had
- 00:39:41scenarios where uh you run into extfs or
- 00:39:44whatever file system related limitations
- 00:39:46on because like Postgres stores everything
- 00:39:49in a single directory right so have you
- 00:39:51had hit something like that and if so
- 00:39:52what do you do yeah I mean
- 00:39:55U I would I would categorize it as some
- 00:39:58of the mistakes we did at the beginning
- 00:39:59the file limits were wrong at the to
- 00:40:01begin with but post that we' have never
- 00:40:03AC never really come across any file
- 00:40:05limit issues uh we have I mean more than
- 00:40:09happy to admit it we have come across
- 00:40:10issues where the we have run out of
- 00:40:12integers for our
- 00:40:14ID because that's a number of columns we
- 00:40:17stored in one single go that also has
- 00:40:18happened so uh and then the import
- 00:40:21stopped then we had to do a ridiculous
- 00:40:23amount of things alter you know how much
- 00:40:24time would have altering the table would
- 00:40:26have taken.
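The "ran out of integers for our ID" incident is the classic int4 overflow at about 2.1 billion values; the straightforward fix, on a hypothetical table, is the kind of full-table rewrite being alluded to:

```sql
-- int4 tops out at 2,147,483,647. Widening the column fixes it, but the
-- ALTER rewrites the whole table under an ACCESS EXCLUSIVE lock, which is
-- why doing it on a huge table mid-incident is so painful.
ALTER TABLE tradebook
    ALTER COLUMN id TYPE bigint;
```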
- 00:40:29But no we didn't uh it was a server configuration
- 00:40:31mistake that from our end but it was
- 00:40:33never the issue of post so I haven't
- 00:40:36come across it in my experience okay
- 00:40:38thank
- 00:40:41you so you said you hardly have any
- 00:40:44crashes or any know downtime so is it
- 00:40:47with some kind of a ha solution or it's
- 00:40:50just you know the instance doesn't crash
- 00:40:52what's the magic oh what's the m i mean
- 00:40:55I think the magic is by the Postgres
- 00:40:57developers no uh I think the reason we
- 00:41:00don't have a lot of Crash is we um we
- 00:41:05have ensured that all our apps are not
- 00:41:07sitting on top of massive databases
- 00:41:09they're always sitting on top of caching
- 00:41:10layers one uh you cannot ever ever ever
- 00:41:14scale an app on top of 10 20 terabytes
- 00:41:17of data and expect it to work without
- 00:41:18crashing it will crash if that happens
- 00:41:20it will overload and we have crashed our
- 00:41:22databases but the mistake was not of
- 00:41:24postgress that is wrong to expect that
- 00:41:26the mistake was that we thought our app
- 00:41:28can easily query that much data in this
- 00:41:30much amount of time and be fine with it
- 00:41:33it will never work as soon as we meet it
- 00:41:34async as soon as we made it uh behind
- 00:41:37our caching layer it worked absolutely
- 00:41:39fine so it's uh again to there's the
- 00:41:42same answer it wasn't the issue of Postgres
- 00:41:44it was our mistake that we had to
- 00:41:45rectify
- 00:41:51thanks okay so we'll take last questions
- 00:41:55after that you go offline questions
- 00:42:00uh this is regarding today's morning
- 00:42:02session right like Kash was addressing
- 00:42:05that uh before covid you could able to
- 00:42:07take uh 2 million request and during
- 00:42:11covid like you are able to scale up to 8
- 00:42:14million to 12 million uh without scaling
- 00:42:17your system how did that
- 00:42:19happen
- 00:42:22um okay um I'm going to S sound a little
- 00:42:25dumb here I guess but caching is a
- 00:42:27magical layer on top of everything I
- 00:42:29guess we were already ready to serve uh
- 00:42:32we did increase we did increase our
- 00:42:34primary DB servers uh the number of
- 00:42:36cores number of parallel workers that
- 00:42:38query the database all of those tuning
- 00:42:40had to change obviously now was it over
- 00:42:42provisioned uh no it was never
- 00:42:44over-provisioned it was always 1db so
- 00:42:46there is no over-provisioning 1 DB it's
- 00:42:47not like it was multi- sharded setup so
- 00:42:49it was 1db we added more cores to it the
- 00:42:52the jobber is a separate server that
- 00:42:54runs the the caching server that we call
- 00:42:56it right right so that was never
- 00:42:58over-provisioned that is still whatever
- 00:43:00we started with it's the exact same
- 00:43:02setup till now 16 cores 32 GB RAM still
- 00:43:04now and that's how we started three
- 00:43:06years back uh works fine um I don't know
- 00:43:10man the I guess that's how good the
- 00:43:12caching layer
- 00:43:13is uh you can say that probably we over
- 00:43:17provisioned before that because when you
- 00:43:21we by default start with this 16 uh
- 00:43:23cores 32 when you're dealing with a
- 00:43:25Postgres DB because we are used to tuning
- 00:43:27it for that so we know the tuning
- 00:43:30parameters for those set of numbers so
- 00:43:32that's how we start off with that
- 00:43:33usually in that case maybe that's how we
- 00:43:35started here like that we thought that
- 00:43:36it would work fine have you ever
- 00:43:37forecasted that have you ever forecasted
- 00:43:40that load uh sorry I couldn't load load
- 00:43:42load tested uh yeah couple of times uh
- 00:43:45the maximum load that we have gone to uh
- 00:43:48was four or five uh and that's it it's
- 00:43:51never been more than that our Postgres
- 00:43:53database has been overloaded multiple
- 00:43:54times and every single time it has been
- 00:43:57loaded has been our mistake where we
- 00:43:59have skipped the caching layer and hit
- 00:44:01the database directly and as I said that
- 00:44:03will never scale it doesn't matter if
- 00:44:04it's one terabyte or 500 GB it it will
- 00:44:06not work so we have every time we
- 00:44:09consciously write a new API endpoint we
- 00:44:11ensure that the first uh thing first
- 00:44:14Frontier has to be the caching layer on
- 00:44:16sitting on top of it and everything has
- 00:44:18to be async it cannot be concurrent uh
- 00:44:21it cannot be concurrent queries hitting
- 00:44:22the DB and uh an HTTP API endpoint
- 00:44:26waiting for the response to happen uh
- 00:44:28again that will not scale your app will
- 00:44:29go down for sure eventually everything
- 00:44:32will be in a weight IO situation and
- 00:44:33nothing will work thank you
- PostgreSQL
- Zerodha
- fintech
- database
- indexing
- sharding
- manual vacuum
- materialized views
- query performance
- caching layer