Database Tuning at Zerodha - India's Largest Stock Broker

00:44:36
https://www.youtube.com/watch?v=XB2lF_Z9cbs

Summary

TLDR: This talk, given by a member of Zerodha's tech team, covers the team's experience building and managing databases on PostgreSQL, specifically for critical financial data. The speaker explains how Zerodha's database systems are organised, including sharding data by financial year and using an additional caching layer to improve database performance. The talk stresses finding a balance in the use of indexing and materialized views to remove query bottlenecks. One recommended practice is to use partial indexes and avoid over-indexing. The speaker also emphasises understanding the query planner to investigate performance bottlenecks, and denormalizing data where joins become too slow. Throughout, the speaker shares practical, directly applicable experience of scaling successfully without over-engineering, and touches on the need to stay flexible about the choice of database.

Takeaways

  • 🍽️ The talk takes place right after lunch, and the speaker jokes about keeping the audience awake.
  • 📉 Zerodha, a fintech stock broker, uses PostgreSQL to manage critical financial data.
  • 🔍 The talk shares first-hand experience and key lessons in database management.
  • 🗄️ Data is sharded by financial year to keep the load manageable.
  • 💾 A caching layer takes read load off the main database.
  • 🤝 Praise for the PostgreSQL community and its good documentation.
  • 🛠️ Partial indexes are used for better database performance.
  • 🛡️ PostgreSQL has proven resilient for Zerodha, even through early mistakes such as log files overflowing and crashing the database.
  • 🚀 Materialized views and denormalization are used to speed up queries.
  • 🔧 Vacuum is run manually because the data import schedule is irregular.
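
The last takeaway, running vacuum by hand once the imports have landed, could be sketched roughly as follows. This is only an illustration under stated assumptions: the table names (`trades`, `ledger`) and the helper functions are hypothetical and do not come from the talk, and switching autovacuum off per table is just one possible way to do it. The SQL forms themselves (`ALTER TABLE ... SET (autovacuum_enabled = false)` and `VACUUM (ANALYZE)`) are standard PostgreSQL. The sketch only builds the statements, so it runs without a database.

```python
import re

# Hypothetical table names, used only for illustration.
IMPORT_TABLES = ["trades", "ledger"]

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Double-quote a simple SQL identifier, rejecting anything unusual."""
    if not _IDENT.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return f'"{name}"'


def disable_autovacuum_statements(tables):
    """One-off statements that switch autovacuum off for each table (an assumption)."""
    return [
        f"ALTER TABLE {quote_ident(t)} SET (autovacuum_enabled = false);"
        for t in tables
    ]


def post_import_statements(tables):
    """Statements to run, one table after the other, once the bulk import is done.

    VACUUM (ANALYZE) marks dead rows as reusable and refreshes planner
    statistics, without the exclusive lock that VACUUM FULL takes.
    """
    return [f"VACUUM (ANALYZE) {quote_ident(t)};" for t in tables]


if __name__ == "__main__":
    for stmt in disable_autovacuum_statements(IMPORT_TABLES):
        print(stmt)
    for stmt in post_import_statements(IMPORT_TABLES):
        print(stmt)
```

If these statements were sent through a database driver, note that PostgreSQL does not allow `VACUUM` inside a transaction block, so the connection would need to be in autocommit mode.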

Timeline

  • 00:00:00 - 00:05:00

    The speaker opens with how Zerodha uses Postgres to manage its data, stressing that these lessons may not apply to anyone else, because a database setup should be specific to its context rather than generic. The speaker also thanks the Postgres core team for its excellent documentation and for how resilient Postgres has been.

  • 00:05:00 - 00:10:00

    Zerodha's use of Postgres began when the company was young and importing around 150 MB of data per day. Early problems included over-indexing, log files overflowing, and crashes. The lessons learned were to refine the schema design, build applications to fit the database, and fix problems quickly so improvements stay efficient.

  • 00:10:00 - 00:15:00

    The speaker turns to managing large data, noting that the term 'big data' is not relevant to every organisation. The advice is to index selectively, denormalize data so it is easier to query, and use materialized views for better performance.

  • 00:15:00 - 00:20:00

    Query management and optimisation are the main topic, with a look at selective indexing, denormalization, and materialized views to make queries more efficient. The speaker stresses understanding the query planner and tuning the database for individual tables and specific queries.

  • 00:20:00 - 00:25:00

    A deeper understanding of the query planner and of the data improves database performance. The speaker also describes experimenting with Postgres autovacuum and concluding that, in Zerodha's specific context, it is better turned off with vacuuming managed manually. These lessons were learned through experience.

  • 00:25:00 - 00:30:00

    Postgres is also used for caching at Zerodha, through a secondary database that caches frequently accessed data. This keeps load off the primary system and keeps data access fast during trading hours.

  • 00:30:00 - 00:35:00

    The talk explains the caching structure and the query flow used to reduce load on the main database while serving traders. The speaker describes partitioning data by financial year and using foreign data wrappers to manage data at scale.

  • 00:35:00 - 00:44:36

    The speaker closes by stressing a willingness to test and use whichever database solution best fits a given context, without being overly biased towards Postgres or any other system. Simple solutions that were not over-engineered have delivered significant progress and stability for Zerodha.
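
The financial-year partitioning mentioned in the timeline could be routed with a small helper like the one below. This is a minimal sketch, not Zerodha's implementation: it assumes the Indian financial year (1 April to 31 March), and the database naming scheme `console_fy_...` is made up for illustration.

```python
from datetime import date


def financial_year(d: date) -> str:
    """Return the Indian financial year (1 April to 31 March) containing d, e.g. '2023-24'."""
    start = d.year if d.month >= 4 else d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


def shard_for(d: date) -> str:
    """Map a trade date to a hypothetical per-financial-year database name."""
    return "console_fy_" + financial_year(d).replace("-", "_")


if __name__ == "__main__":
    print(shard_for(date(2024, 1, 15)))
```

Keying the shard on the financial year rather than the calendar year keeps every report for one year (profit and loss, ledger) inside a single database.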


Video Q&A

  • What is Zerodha?

    Zerodha is an Indian fintech stock broker, sometimes described as the Robinhood of India.

  • When is the stock market open in India?

    The stock market in India is open from 9:15 am to 3:30 pm each day.

  • What makes PostgreSQL special according to the speaker?

    PostgreSQL is known for its good documentation and friendly community, as well as its robustness and resilience when used for critical financial data.

  • How does Zerodha store and import trade books?

    Zerodha imports trade books every night to compute financial information such as profit and loss and ledgers.

  • What kind of caching layer does Zerodha use?

    Zerodha uses an additional PostgreSQL database as a caching layer, synchronised with the main database, to hold data temporarily.

  • Why doesn't Zerodha favour traditional DB replication?

    Zerodha does not use master-slave replication. Instead it relies on backups and archived data stored in S3 so that it can restore quickly if something goes wrong.

  • How does Zerodha handle data sharding?

    Zerodha partitions its databases by financial year and uses foreign data wrappers to connect data across servers.

  • Why did Zerodha decide not to let autovacuum run automatically in PostgreSQL?

    Vacuuming is run manually because the irregular data import schedule does not suit automatic autovacuum. It also helps manage server resources.

  • What is the role of Console in Zerodha's systems?

    Console is the back-office platform that imports and stores trade books in order to compute financial information such as users' profit and loss and ledgers.

  • How does Zerodha use indexing in PostgreSQL?

    Zerodha uses partial indexes and avoids over-indexing so that query processing stays efficient.
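
To make the partial-index answer concrete, here is a minimal runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in, purely so the example runs without a server; the `CREATE INDEX ... WHERE ...` syntax shown is the same in PostgreSQL. The `trades` table, its columns, and the index name are hypothetical and not taken from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trades (
        id INTEGER PRIMARY KEY,
        client_id TEXT NOT NULL,
        status TEXT NOT NULL,
        qty INTEGER NOT NULL
    );
    -- Partial index: only rows with status = 'open' are indexed,
    -- so the index stays small even if the table is large.
    CREATE INDEX idx_trades_open_client
        ON trades (client_id)
        WHERE status = 'open';
    """
)
conn.executemany(
    "INSERT INTO trades (client_id, status, qty) VALUES (?, ?, ?)",
    [("AB1234", "open", 10), ("AB1234", "closed", 5), ("ZZ0001", "open", 7)],
)


def plan(sql: str, params=()) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# The query repeats the index predicate, so the partial index is usable.
open_plan = plan(
    "SELECT qty FROM trades WHERE client_id = ? AND status = 'open'", ("AB1234",)
)
# This query asks for rows the partial index does not cover.
closed_plan = plan(
    "SELECT qty FROM trades WHERE client_id = ? AND status = 'closed'", ("AB1234",)
)

print(open_plan)
print(closed_plan)
```

The trade-off is the one described in the talk: the frequent, user-facing query gets an index, while the rarer query is allowed to be slower and falls back to scanning the table.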

Subtitles
  • 00:00:00
    okay um first of all uh good afternoon
  • 00:00:03
    everyone uh I hope the lunch was good uh
  • 00:00:06
    but obviously not too good so that you
  • 00:00:08
    don't sleep off while I give this talk
  • 00:00:11
    uh and welcome to my presentation of how
  • 00:00:14
    we use Postgres in
  • 00:00:16
    Zerodha uh and what what have we learned
  • 00:00:19
    from it and our mistakes our experiences
  • 00:00:22
    everything and where we are right
  • 00:00:25
    now so setting up the context for the
  • 00:00:28
    talk um to quote my favorite Salman B
  • 00:00:32
    movie Race 3 our learnings are our
  • 00:00:34
    learnings none of your
  • 00:00:36
    learnings uh what it means is that
  • 00:00:38
    everything that I'm going to speak about
  • 00:00:40
    here is something that we have learned
  • 00:00:42
    in our experience in the context of the
  • 00:00:45
    data that how we use how we import it
  • 00:00:47
    might not apply to even one person in
  • 00:00:50
    this room and that is how databases
  • 00:00:52
    should be it it should not be extremely
  • 00:00:54
    generic either um you might disagree or
  • 00:00:57
    be triggered by how we use Postgres
  • 00:01:01
    and that is okay I have been told by uh
  • 00:01:04
    Kailash our CTO to not make any jokes even
  • 00:01:07
    PG-13
  • 00:01:10
    ones uh little bit about me uh I've
  • 00:01:12
    been at Zerodha since day one of the tech
  • 00:01:15
    Team um as the 10x engineer uh these are
  • 00:01:19
    all memes that we have internally about
  • 00:01:21
    each other and been managing the
  • 00:01:24
    backend uh full stack backend for the
  • 00:01:28
    entire time I've been at Zerodha gone
  • 00:01:31
    through all possible databases um MySQL
  • 00:01:35
    Postgres Redis uh MongoDB uh ClickHouse
  • 00:01:40
    Cockroach the list is
  • 00:01:43
    endless uh and and before I get into
  • 00:01:47
    this talk I first of all like to I mean
  • 00:01:50
    say thanks to the core team of Postgres
  • 00:01:54
    uh because I've come across multiple
  • 00:01:56
    languages databases software FOSS or
  • 00:02:00
    Enterprise but I don't think there has
  • 00:02:03
    been anyone better at
  • 00:02:06
    documenting their features as well as
  • 00:02:08
    Postgres has done I don't think there
  • 00:02:10
    is anyone that has a better blueprint of
  • 00:02:13
    what they want to do in their future
  • 00:02:15
    updates like Postgres has done I don't
  • 00:02:17
    think there is I might be wrong here
  • 00:02:19
    again because as I said it's our
  • 00:02:21
    learnings but I don't think there is
  • 00:02:22
    anything as resilient as Postgres has
  • 00:02:24
    been for us um and we have done
  • 00:02:28
    ridiculous things with it and this just
  • 00:02:30
    worked uh from upgrading from Postgres
  • 00:02:34
    8 is where we started to Postgres uh we
  • 00:02:37
    are right now at
  • 00:02:38
    PG-13 uh and it has the updation has
  • 00:02:42
    never caused an issue no data loss
  • 00:02:44
    nothing and that is U like cannot be
  • 00:02:48
    more thankful to the core development
  • 00:02:50
    team of Postgres and the community of
  • 00:02:52
    Postgres which has always been super
  • 00:02:54
    nice in answering any of our doubts on
  • 00:02:56
    the Slack
  • 00:02:57
    channels so
  • 00:03:00
    uh history of uh Postgres usage in
  • 00:03:03
    Zerodha we started out uh first of all let
  • 00:03:06
    me set a bit of context of how Zerodha
  • 00:03:09
    imports or uses its data and maybe that
  • 00:03:11
    will be helpful in understanding why we
  • 00:03:13
    do things with Postgres the way we do
  • 00:03:16
    it as you know Zerodha is a
  • 00:03:21
    fintech Indian broker maybe I should
  • 00:03:23
    have introduced Zerodha first I don't
  • 00:03:24
    think everyone knows uh what Zerodha is
  • 00:03:27
    so we are a stock broker uh we uh Robinhood
  • 00:03:32
    of India or Robinhood is Zerodha of
  • 00:03:33
    US uh we deal with stock market and uh
  • 00:03:38
    we
  • 00:03:40
    import trade books we basically build a
  • 00:03:43
    software for people to trade on so which
  • 00:03:44
    means that we have to deal with all
  • 00:03:46
    kinds of financial information and it
  • 00:03:48
    also means Computing a lot of financial
  • 00:03:50
    information like P&L of uh profit and
  • 00:03:53
    loss of users how much the ledger of
  • 00:03:55
    users how much money they have
  • 00:03:57
    transferred in transferred out all
  • 00:03:58
    critical financial information that we
  • 00:04:00
    store and we use Postgres for
  • 00:04:02
    it uh markets are open from 9:15 to
  • 00:04:08
    3:30 every day after that MCX is open but
  • 00:04:11
    I don't think we ever cared a lot about
  • 00:04:14
    it but uh yeah markets are open from
  • 00:04:16
    9:15 to 3:30 for majority for most of our
  • 00:04:19
    traders um and we our systems that we
  • 00:04:24
    have built some of them are read-only
  • 00:04:26
    throughout the day and become write-only
  • 00:04:29
    at night
  • 00:04:30
    many of many of the systems that are
  • 00:04:32
    built are usually read and write
  • 00:04:34
    throughout the day and night but our
  • 00:04:36
    systems are a bit different than that
  • 00:04:37
    and the systems that I have worked on uh
  • 00:04:40
    we have a trading platform called kite
  • 00:04:43
    uh which has a transactional DB which
  • 00:04:46
    again uses Postgres that is read-write
  • 00:04:48
    throughout the day but console which is
  • 00:04:51
    our backend back office platform where
  • 00:04:54
    all the trade books all the information
  • 00:04:56
    regarding anything the user has done
  • 00:04:58
    throughout the day on our Trading
  • 00:04:59
    platform gets imported in that import
  • 00:05:03
    happens at night that is the writes the
  • 00:05:05
    bulk writes happen at night but majority
  • 00:05:08
    of it it remains a read-only platform
  • 00:05:10
    throughout the day with very few writes
  • 00:05:12
    so that is the context on which we built
  • 00:05:15
    our schemas our queries our databases
  • 00:05:17
    and how we
  • 00:05:19
    scale so uh we started off with
  • 00:05:21
    importing around uh so when I joined
  • 00:05:24
    Zerodha used to have 20,000 clients um not
  • 00:05:28
    even all of them are active and we used
  • 00:05:30
    to import around 150 MB of data per day
  • 00:05:35
    at best and I used to have uh I am
  • 00:05:40
    saying I a lot here because at point of
  • 00:05:42
    time it was just two or three of us uh I
  • 00:05:44
    mean if you have read our blogs you
  • 00:05:46
    would know that we are a very lean very
  • 00:05:47
    small team and we still have remained so
  • 00:05:50
    like that so I used to face a lot of
  • 00:05:52
    issues with scaling even that 100 MB of
  • 00:05:55
    data when we started out with um when I
  • 00:05:59
    look back back at it lot of things that
  • 00:06:00
    I did was extremely obviously dumb uh
  • 00:06:03
    lack of understanding of how data Works
  • 00:06:05
    understanding of how databases work um
  • 00:06:08
    over indexing issues under indexing
  • 00:06:11
    everything every possible thing that you
  • 00:06:12
    can think of can go wrong in a database
  • 00:06:15
    um for example let's say uh the log
  • 00:06:18
    files overflowing and causing the
  • 00:06:21
    database to crash so everything that can
  • 00:06:24
    possibly go wrong uh has gone wrong with
  • 00:06:26
    us we have learned from it uh We've
  • 00:06:28
    improved our software the way we deal with
  • 00:06:32
    uh storing our own data so started off
  • 00:06:35
    with 100 MB uh 100 MB failed uh there
  • 00:06:38
    was Postgres 8 uh improved on our
  • 00:06:41
    schemas improved our schema design
  • 00:06:43
    improved the way an app has to built on
  • 00:06:46
    has to be built on top of um our
  • 00:06:49
    databases not rewrote our apps multiple
  • 00:06:51
    times uh again if you have read any of
  • 00:06:54
    our posts you would know that we we
  • 00:06:56
    rewrite a lot of our things multiple
  • 00:06:59
    times over over and over again um it is
  • 00:07:02
    mundane might be but it solves it solves
  • 00:07:05
    a lot of headache for us by removing uh
  • 00:07:08
    Legacy code Legacy issues and I would
  • 00:07:11
    say Legacy schemas too because you might
  • 00:07:13
    have started with a schema that doesn't
  • 00:07:16
    make sense right now uh because your
  • 00:07:18
    queries have changed the way you deal
  • 00:07:19
    with the data has changed so we end up
  • 00:07:23
    rewriting everything we know that
  • 00:07:24
    nothing is constant Everything Will
  • 00:07:26
    Change needs to change everything will
  • 00:07:27
    break and that's okay we are okay with
  • 00:07:29
    it uh we currently deal with hundreds of
  • 00:07:32
    GB of import every single day um uh
  • 00:07:36
    absolutely no issues at all I mean there
  • 00:07:38
    are plenty of issues but Postgres has
  • 00:07:40
    worked fine for us till now though we
  • 00:07:44
    have other plans of doing other things
  • 00:07:45
    with it but till now again nothing as
  • 00:07:49
    resilient as good as Postgres has been
  • 00:07:52
    for us so how do we
  • 00:07:56
    manage uh this big amount of data I've
  • 00:08:00
    put a question mark there
  • 00:08:01
    because when we when we started out um
  • 00:08:06
    understanding our data better I remember
  • 00:08:08
    this was six years back probably I
  • 00:08:11
    remember sitting with Nithin our CEO and
  • 00:08:13
    even Kailash and Nithin used to be like so so
  • 00:08:16
    we are very close to Big Data right
  • 00:08:18
    because big data used to be this fancy
  • 00:08:21
    term at that point of time I never
  • 00:08:23
    understood what Big Data meant uh I
  • 00:08:26
    assumed that it's just a nice looking
  • 00:08:28
    term on your resume right you you're
  • 00:08:30
    you're managing Big Data um eventually
  • 00:08:33
    we uh eventually I guess we all realize
  • 00:08:37
    that all of that is pretty much hogwash
  • 00:08:39
    uh there are companies which need big
  • 00:08:42
    data there are companies which don't
  • 00:08:43
    need big data you don't have to be a
  • 00:08:46
    serious engineering company if you I
  • 00:08:48
    mean if you don't need to have big data
  • 00:08:50
    to be a serious engineering company you
  • 00:08:52
    can make do with little less data so um
  • 00:08:56
    I'm going to be this talk is probably
  • 00:08:58
    going to be a bit of an overview of
  • 00:09:00
    how we manage our data till now but um I
  • 00:09:03
    glad to I'll be more than glad to take
  • 00:09:05
    questions at the end of it if there are
  • 00:09:07
    more doubts or anything else uh first
  • 00:09:10
    thing is uh index uh but don't overdo it
  • 00:09:14
    so when we started
  • 00:09:16
    out I I thought that indexing was like a
  • 00:09:19
    foolproof plan to solve everything that
  • 00:09:22
    is there realized it much later that
  • 00:09:24
    indexing itself takes a lot of space
  • 00:09:27
    indexing in itself uh uh you can't index
  • 00:09:31
    for every query that you write you need
  • 00:09:33
    to First understand that there are some
  • 00:09:35
    queries that need to be fast and some
  • 00:09:36
    queries that you can afford it to be
  • 00:09:38
    slow and that's okay so how we have
  • 00:09:42
    designed our systems is the queries that
  • 00:09:44
    um
  • 00:09:46
    are the the the number of queries are
  • 00:09:49
    higher for let's say a particular set of
  • 00:09:50
    columns those columns are indexed and uh
  • 00:09:54
    the columns that are not indexed they
  • 00:09:56
    might be queried and but we don't index
  • 00:09:58
    them at all and that's okay those
  • 00:10:00
    queries might take a long enough long
  • 00:10:02
    time but they're not user facing they
  • 00:10:05
    are backend reports that it generated
  • 00:10:07
    over time not everything has to happen
  • 00:10:09
    in 1 second or half a millisecond or
  • 00:10:11
    stuff like that so we're very aware of
  • 00:10:12
    that when we index we use partial
  • 00:10:14
    indexes everywhere uh that's another
  • 00:10:16
    thing that we learned that uh even if
  • 00:10:18
    you're indexing a column you can partial
  • 00:10:21
    indexing will be much more helpful for
  • 00:10:23
    you in categorizing the kind of data
  • 00:10:25
    that you want to search um the second
  • 00:10:28
    thing is materialized views um I'll
  • 00:10:31
    combine materialized views and the
  • 00:10:33
    denormalization point into one uh the
  • 00:10:35
    reason being uh if if any of you have
  • 00:10:38
    done engineering here you would you
  • 00:10:40
    would have studied database systems and
  • 00:10:41
    one of the first things that that is
  • 00:10:43
    taught to us is normalize normalize
  • 00:10:45
    normalize everything right and when we
  • 00:10:47
    come out we we come out with this with
  • 00:10:49
    this idea that we need to
  • 00:10:51
    normalize uh all of our data sets you'll
  • 00:10:54
    realize that this works well on smaller
  • 00:10:58
    data
  • 00:10:59
    as the data grows those join queries
  • 00:11:02
    will stop working those join queries
  • 00:11:04
    will become so slow that there is
  • 00:11:06
    absolutely nothing you can do to fix it
  • 00:11:09
    so we took a conscious decision to
  • 00:11:12
    denormalize a lot of our data sets so
  • 00:11:15
    majority of our data sets majority of
  • 00:11:17
    our tables have nothing to do with each
  • 00:11:19
    other and we are okay with that it
  • 00:11:21
    obviously leads to
  • 00:11:23
    increase in the size of data that we
  • 00:11:25
    store but the the trade-off that we get
  • 00:11:29
    in improvement improvement of query is
  • 00:11:31
    much higher than the size increase we
  • 00:11:35
    can always Shard and make our database
  • 00:11:37
    smaller or delete data or do whatever
  • 00:11:39
    but query Improvement is a very
  • 00:11:41
    difficult task to pull off uh if you if
  • 00:11:44
    your entire query is a bunch of nested
  • 00:11:46
    joins across uh two heavy tables we
  • 00:11:50
    avoid that everywhere and one of the
  • 00:11:52
    ways we avoid it is obviously as I said
  • 00:11:53
    we denormalize a lot and we uh have
  • 00:11:58
    materialized views
  • 00:11:59
    everywhere in our system uh and that is
  • 00:12:03
    one of the easiest cleanest fastest ways
  • 00:12:06
    to make your queries work faster if
  • 00:12:09
    there is a bunch of small data set that
  • 00:12:12
    is getting reused all over your
  • 00:12:13
    Postgres query multiple times over use
  • 00:12:16
    WITH statements use materialized views
  • 00:12:18
    and it will be uh your queries will
  • 00:12:21
    automatically be fast I don't want to
  • 00:12:23
    give you statistics about 10x fast or
  • 00:12:25
    20x fast and all because it again
  • 00:12:27
    depends upon data your query your server
  • 00:12:29
    size all of those things so no no
  • 00:12:32
    metrics as such being thrown here but it
  • 00:12:35
    will have a much better experience than
  • 00:12:37
    doing multiple joins across massive
  • 00:12:39
    tables avoid that at all costs um one
  • 00:12:43
    more thing is understanding your data
  • 00:12:45
    better and by that I
  • 00:12:48
    mean I feel like uh and this is
  • 00:12:51
    something that I've learned after
  • 00:12:52
    talking to a lot of people uh of
  • 00:12:55
    different companies or uh different
  • 00:12:57
    startups and how they work
  • 00:12:59
    and they pick the database first and
  • 00:13:02
    then they figure out how to put the data
  • 00:13:03
    into the database I don't know why they
  • 00:13:05
    do that maybe the stack looks more uh
  • 00:13:08
    Rockstar like I guess uh if you choose
  • 00:13:10
    some fancy database and then try to pigeonhole
  • 00:13:12
    the data into it uh picking
  • 00:13:15
    first understanding the data then
  • 00:13:18
    understanding how you will query the
  • 00:13:19
    data should be the first step before you
  • 00:13:22
    pick what kind of database and how you
  • 00:13:25
    will uh design the schema of the
  • 00:13:27
    database if you don't do that if if you
  • 00:13:29
    say that you know what it's it's a
  • 00:13:30
    Postgres conference it's going to be
  • 00:13:33
    just Postgres in my stack there will be
  • 00:13:34
    nothing else nowhere uh Postgres is
  • 00:13:38
    like the one true solution for
  • 00:13:40
    everything so that's that's not going to
  • 00:13:42
    work um then the next point is Postgres
  • 00:13:46
    DB tuning around queries uh one more
  • 00:13:50
    thing we have uh realized is many people
  • 00:13:53
    tune the database this something that I
  • 00:13:55
    came across again very recently while I
  • 00:13:57
    was dealing with another company uh
  • 00:13:59
    database stack they have tuned their
  • 00:14:02
    database in in a wholesome manner that
  • 00:14:04
    means that the entire database has a set
  • 00:14:07
    of parameters that they have done PG
  • 00:14:08
    tuning for uh and it caters to every
  • 00:14:12
    single table that is there in database
  • 00:14:13
    and that is a terrible approach if you
  • 00:14:16
    have a lot of data a better way to do is
  • 00:14:19
    you tune your DB there's no denying that
  • 00:14:22
    but you also tune your tables maybe a
  • 00:14:24
    particular table needs more parallel
  • 00:14:26
    workers maybe a particular table needs to be
  • 00:14:29
    frequently vacuumed compared to the
  • 00:14:31
    other set of tables that you have in
  • 00:14:33
    your DB so um you need to you need to
  • 00:14:37
    tune based upon the queries that hit
  • 00:14:39
    those particular tables rather than the
  • 00:14:40
    entire database in
  • 00:14:42
    itself um the last I mean understanding
  • 00:14:46
    a query planner I'm sure there is uh
  • 00:14:49
    there's a mistake understanding a query
  • 00:14:50
    planner so uh another mistake when I
  • 00:14:54
    started out was I'm sure I don't know
  • 00:14:57
    how many of you feel that way with a
  • 00:14:59
    query planner of Postgres or any
  • 00:15:01
    database is a little hard to understand
  • 00:15:04
    um and I felt that for the longest time
  • 00:15:06
    I would it will just print a bunch of
  • 00:15:08
    things and all I will read is the the
  • 00:15:10
    last set of things right so it took this
  • 00:15:13
    much time it accessed this much data and
  • 00:15:16
    that's all I understood from those query
  • 00:15:18
    planners took me a very long time to
  • 00:15:21
    understand the direction of the query
  • 00:15:23
    which is very very important to
  • 00:15:25
    understand uh direction of the query
  • 00:15:26
    would be what is called first a WHERE
  • 00:15:28
    clause an AND clause a JOIN clause in
  • 00:15:30
    your entire query if you do not
  • 00:15:32
    understand that you will not be able to
  • 00:15:33
    understand your query plan at all and
  • 00:15:36
    it's very easy to understand a query
  • 00:15:37
    plan of a simple query right if you do a
  • 00:15:39
    select star from whatever table and
  • 00:15:40
    fetch that you don't even need a query
  • 00:15:42
    plan for that if the database is if the
  • 00:15:45
    if there's if the index is not there
  • 00:15:47
    that query will be slow you don't need a
  • 00:15:48
    query plan to tell you that but query
  • 00:15:51
    plan is super helpful when you're doing
  • 00:15:53
    joins across multiple tables and uh
  • 00:15:57
    understanding what kind of
  • 00:15:59
    uh sorts are being called is very very
  • 00:16:02
    important to understand I think me and
  • 00:16:04
    Kash must have sat and debugged multiple
  • 00:16:06
    queries trying to understand the query
  • 00:16:08
    planner of it all and Postgres is very
  • 00:16:10
    funny with its query planning so uh
  • 00:16:14
    there will be a certain clause in which
  • 00:16:17
    a completely different query plan will
  • 00:16:18
    be chosen for no reason at all and you
  • 00:16:21
    have to and there have been reasons
  • 00:16:22
    where we don't I still don't understand
  • 00:16:24
    some of the query plans that are there
  • 00:16:25
    but we have backtracked like into a into
  • 00:16:28
    a explanation for ourselves that if we
  • 00:16:31
    do this this this this then our query
  • 00:16:33
    plans will look like this and if we do
  • 00:16:35
    these set of things our query plans will
  • 00:16:36
    look like that this is better than this
  • 00:16:38
    we'll stick to this and we have followed
  • 00:16:40
    that everywhere
  • 00:16:42
    and I don't think I don't think you can
  • 00:16:45
    look at a documentation and understand a
  • 00:16:47
    query plan either this is something that
  • 00:16:49
    you have to play around with your
  • 00:16:50
    queries play around with your data to
  • 00:16:52
    get to the point um the queries that I
  • 00:16:55
    would have in my system on my set of
  • 00:16:58
    data would have a you reduce a you
  • 00:17:01
    reduce the data by half and the query
  • 00:17:03
    plan will work very differently just the
  • 00:17:05
    way postgress is and that is something
  • 00:17:09
    that you have to respect you have to
  • 00:17:11
    understand and if you don't understand
  • 00:17:12
    query plan uh forget about optimizing
  • 00:17:15
    your queries DB schema nothing nothing
  • 00:17:17
    will ever happen you will just keep
  • 00:17:18
    vacuuming which which brings me back to
  • 00:17:20
    the last point and this is this is funny
  • 00:17:24
    because I was in the vacuuming talk the
  • 00:17:27
    one that happened right before for uh uh
  • 00:17:30
    right before lunch break so the first
  • 00:17:33
    thing he said was do not turn off
  • 00:17:35
    autovacuum the first thing I would say is
  • 00:17:37
    turn off autovacuum so uh and I'll
  • 00:17:40
    tell you why we do that and why it works
  • 00:17:43
    in our context and might not work for
  • 00:17:45
    someone else autovacuum is an
  • 00:17:47
    incredible feature if tuned properly if
  • 00:17:50
    you have seen the tuning parameters
  • 00:17:52
    they're not very easy to understand what
  • 00:17:54
    does delete tuples after X number of
  • 00:17:57
    things even mean there
  • 00:17:59
    they're not easy to what does nap time
  • 00:18:02
    mean how does someone who has not dealt
  • 00:18:05
    with database for a very long time
  • 00:18:07
    understand the set of parameters there
  • 00:18:08
    that is documentation and all of that
  • 00:18:10
    but it's really hard to read an abstract
  • 00:18:13
    documentation and relate it to a schema
  • 00:18:17
    um we we played around with every single
  • 00:18:20
    parameter that autovacuum has nothing
  • 00:18:22
    worked for us and I'll tell you why we
  • 00:18:25
    would bulk we would bulk import billions
  • 00:18:28
    of rows in a fixed set of time now you
  • 00:18:31
    might say that well if you are importing
  • 00:18:33
    everything in a fixed set of time why
  • 00:18:35
    don't you trigger why don't you write
  • 00:18:37
    your autovacuum to work right after
  • 00:18:40
    the import has been
  • 00:18:41
    done that import is never under our control
  • 00:18:45
    the files can come delayed from anywhere
  • 00:18:47
    any point of time and because none
  • 00:18:51
    of it is under our control we decided
  • 00:18:54
    that autovacuum is not a solution for
  • 00:18:56
    us turned it off because it was it was
  • 00:18:59
    going to run forever and ever and ever
  • 00:19:02
    we vacuum uh so I hope most of you know
  • 00:19:05
    the difference between vacuum full and
  • 00:19:06
    vacuum analyze but if you don't know
  • 00:19:07
    vacuum full a very simple explanation
  • 00:19:10
    vacuum full will give you back your
  • 00:19:11
    space that you have updated deleted
  • 00:19:13
    vacuum analyze will improve your query
  • 00:19:14
    plan we don't vacuum full anything
  • 00:19:16
    because that completely blocks the DB we
  • 00:19:18
    vacuum analyze all our tables right
  • 00:19:20
    after doing a massive bulk import uh we
  • 00:19:24
    realize that um I'm sure if you have
  • 00:19:27
    been in the talk he spoke about max
  • 00:19:28
    parallel workers while autovacuuming
  • 00:19:30
    we understand that autovacuuming uses
  • 00:19:32
    the parallelism of Postgres that is
  • 00:19:34
    inbuilt into it which we don't but we
  • 00:19:37
    don't really care about it because this
  • 00:19:39
    happens late in the night
  • 00:19:41
    and vacuuming taking half an hour more
  • 00:19:44
    or 10 minutes more doesn't make a big
  • 00:19:46
    difference for us at that point of time
  • 00:19:48
    so in this context in this scenario
  • 00:19:50
    turning off autovacuum and running
  • 00:19:52
    vacuum on our own as a script that
  • 00:19:55
    triggers vacuum for multiple tables one
  • 00:19:57
    after the other once our imports are
  • 00:19:59
    done works for us but to uh to reiterate
  • 00:20:03
    again it might not work for your context
  • 00:20:05
    and maybe autovacuum is the better
  • 00:20:07
    solution but remember that autovacuum
  • 00:20:09
    has a lot of pitfalls and I will I mean
  • 00:20:13
    I read the Postgres 13 documentation a
  • 00:20:16
    while back it still hasn't improved to
  • 00:20:18
    an extent that I thought it should have
  • 00:20:20
    by now and it still has its set of
  • 00:20:23
    issues while dealing with massive sets
  • 00:20:24
    of data um but I hope I hope it gets
  • 00:20:27
    better over time and uh if if some if
  • 00:20:30
    some core developers can do it then it
  • 00:20:32
    has to be Postgres so I hope they do that
  • 00:20:34
    so
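The manual vacuum routine described above can be sketched as a small script. This is a minimal illustration, not Zerodha's actual tooling: the table names are hypothetical, the function only builds the statements, and running them once the nightly import has finished (through whatever database client is in use) is left to the caller.

```python
from typing import Iterable, List


def build_vacuum_statements(tables: Iterable[str]) -> List[str]:
    """Return one VACUUM (ANALYZE) statement per table, in the given order."""
    statements = []
    for table in tables:
        # Quote the identifier so unusual table names stay valid SQL.
        # Assumption: names are unqualified (no "schema.table" form).
        quoted = '"' + table.replace('"', '""') + '"'
        statements.append(f"VACUUM (ANALYZE) {quoted};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names; the real schema is not given in the talk.
    for statement in build_vacuum_statements(["trades", "ledger_entries"]):
        print(statement)
```

Running the tables one after another, rather than in parallel, matches the talk's point that an extra ten or thirty minutes late at night does not matter for this workload.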
  • 00:20:37
    yeah um okay so this is another
  • 00:20:40
    interesting part of the talk I guess
  • 00:20:44
    but before I get there um I remember
  • 00:20:47
    speaking to someone outside and they
  • 00:20:49
    said that how is your setup like and uh
  • 00:20:52
    what do you do for um
  • 00:20:55
    replica and master-slave and all of
  • 00:20:58
    those set of things
  • 00:20:59
    so I guess this will be triggering for
  • 00:21:02
    everyone we don't have a Master Slave at
  • 00:21:04
    all we don't have a replica either uh we
  • 00:21:08
    have one database and one node we
  • 00:21:12
    have sharded it using foreign data
  • 00:21:14
    wrapper uh why we have sharded it like that
  • 00:21:17
    I will explain I'll get to that um but
  • 00:21:20
    we have sharded it using foreign data
  • 00:21:21
    wrapper so we have divided our data
  • 00:21:23
    across multiple Financial years and kept
  • 00:21:27
    older historical financial years in a
  • 00:21:29
    different uh database server and
  • 00:21:32
    connected both of them using FDW and we
  • 00:21:36
    query the primary DB and it figures out
  • 00:21:38
    from the partitioning that the
  • 00:21:42
    data is not in this database server
  • 00:21:44
    right now and it is in the other
  • 00:21:45
    database it figures it out fetches the
  • 00:21:47
    query for us fetches the data for us um
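The financial-year split described above can be illustrated with a small routing helper. In the setup from the talk, Postgres itself resolves this through partitioning and the foreign data wrapper; the sketch below only mirrors that mapping in application code. It assumes the Indian financial year (1 April to 31 March), and the server names `primary` and `historical` are hypothetical.

```python
from datetime import date


def financial_year(d: date) -> str:
    """Label the April-to-March financial year containing d, e.g. '2023-24'."""
    start = d.year if d.month >= 4 else d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


def server_for(d: date, current_fy: str) -> str:
    """Pick a (hypothetical) server: current year is local, older years are remote."""
    return "primary" if financial_year(d) == current_fy else "historical"


if __name__ == "__main__":
    today = date(2024, 5, 15)
    current = financial_year(today)
    for sample in (date(2024, 4, 1), date(2024, 3, 31), date(2021, 6, 1)):
        print(sample, financial_year(sample), server_for(sample, current))
```

Keying the split on the date that every row already carries is what lets this layout avoid a separate tenant or shard ID column.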
  • 00:21:51
    no slave setup uh our backups are
  • 00:21:53
    archived in S3 we are okay this is by
  • 00:21:57
    the way to reiterate this is uh a back
  • 00:22:00
    office platform we do not promise that
  • 00:22:03
    we'll have 100% uptime we are okay
  • 00:22:05
    with that we understand that if
  • 00:22:08
    Postgres goes down which has never ever
  • 00:22:10
    happened again thanks to Postgres
  • 00:22:12
    but even if it goes down for whatever
  • 00:22:14
    number of reasons we have been able to
  • 00:22:16
    bring that back up bring the database
  • 00:22:18
    back up by using S3 within minutes and I
  • 00:22:22
    have restarted Postgres that is
  • 00:22:24
    pointing towards a completely fresh
  • 00:22:26
    backup from S3 with maybe 15 20
  • 00:22:29
    terabytes of data in under a minute
  • 00:22:32
    or two so it works so there is there
  • 00:22:35
    there might be fancy complicated
  • 00:22:37
    interesting setup to make your replicas
  • 00:22:40
    work but this also works and I many
  • 00:22:43
    people might call it jugaad uh a hacky way
  • 00:22:46
    of doing things but I don't think it is
  • 00:22:48
    I think it's a sensible approach we
  • 00:22:50
    don't want to over engineer anything at
  • 00:22:52
    all if this works why have a bunch of
  • 00:22:55
    systems that you need to understand just
  • 00:22:57
    to manage um a replica setup now
  • 00:23:02
    coming back uh to the question of if we
  • 00:23:05
    don't have a replica how do we load
  • 00:23:07
    balance we don't but what we have done
  • 00:23:11
    differently is that we have a we have a
  • 00:23:14
    second Postgres server that sits on top
  • 00:23:17
    of our primary DB and acts like a
  • 00:23:19
    caching layer we have uh we have an
  • 00:23:23
    open-source uh piece of software called
  • 00:23:25
    SQL Jobber which is an async uh job-based
  • 00:23:30
    mechanism that keeps polling the DB and
  • 00:23:33
    then fetches the data stores it in
  • 00:23:34
    another Postgres instance um and then
  • 00:23:37
    eventually our app understands that the
  • 00:23:41
    fetch is done data is ready to be served
  • 00:23:43
    and it fetches the data
  • 00:23:46
    from the caching layer so we end up
  • 00:23:49
    creating
  • 00:23:51
    around 500 GB worth of I would say
  • 00:23:54
    around 20 to 30 million tables per day
  • 00:23:58
    uh I remember speaking I remember asking
  • 00:24:00
    someone on the Postgres Slack a long time back
  • 00:24:03
    that we are doing this thing where we
  • 00:24:05
    are creating 20 million tables a day and
  • 00:24:07
    they're like why are you doing this isn't
  • 00:24:09
    there another way of doing it and we're
  • 00:24:11
    like no this works and the reason why we
  • 00:24:13
    do this is because uh Postgres in
  • 00:24:16
    itself supports sorting uh which Redis
  • 00:24:18
    doesn't I mean at that
  • 00:24:20
    point of time uh it lets us do
  • 00:24:23
    pagination it lets us do search on top
  • 00:24:26
    of trading symbols if we need I mean
  • 00:24:28
    search on top of any columns that we
  • 00:24:29
    need to do if necessary so we have
  • 00:24:32
    Postgres sitting as a caching layer on
  • 00:24:34
    top of our primary Postgres and all the
  • 00:24:38
    queries first come to the SQL Jobber
  • 00:24:40
    application they go to our primary DB
  • 00:24:44
    nothing gets hammered to the primary DB
  • 00:24:45
    though so the primary DB is not under
  • 00:24:47
    any load at any point of time I mean
  • 00:24:49
    there is a query load but it's not
  • 00:24:51
    getting hammered at all the hammering
  • 00:24:53
    happens to this caching DB which gets
  • 00:24:55
    set eventually at some point of time
  • 00:24:57
    with the data
  • 00:24:59
    and then we serve the data to to our end
  • 00:25:02
    users and that remains for the entire
  • 00:25:04
    day because as I said during the day the
  • 00:25:06
    data doesn't change a lot so we can
  • 00:25:08
    afford to cache this data for the entire
  • 00:25:10
    time duration there are some instances
  • 00:25:12
    in which we need to clear
  • 00:25:15
    the cache we can just delete the key and
  • 00:25:17
    then this entire process happens all
  • 00:25:18
    over again for that particular user so
  • 00:25:21
    that's our that's how our Postgres
  • 00:25:23
    caching layer works it has worked fine
  • 00:25:25
    for us every night we clean the 500 GB so
  • 00:25:28
    how we do is every night uh we have two
  • 00:25:31
    500 GB disks uh pointing to the
  • 00:25:34
    server we switch from disk one to disk
  • 00:25:36
    two then the disk one gets cleaned up
  • 00:25:38
    then the next day it goes back
  • 00:25:40
    from disk two to disk one and again the
  • 00:25:42
    new tables are set all over again
  • 00:25:44
    works fine uh never been an issue with
  • 00:25:47
    this um coming back to our learnings
  • 00:25:50
    with Postgres yeah sorry
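The caching flow described above (check for a ready result table, otherwise query the primary once, store the rows, then serve sorted and paginated reads from the cache) can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: two in-memory SQLite databases stand in for the two Postgres instances, the `trades` table and its columns are hypothetical, and the real SQL Jobber pipeline is asynchronous and job-based rather than an inline function call.

```python
import hashlib
import sqlite3


def result_table(user_id: str, report: str) -> str:
    """Derive a safe, deterministic table name for one user's report."""
    digest = hashlib.sha1(f"{user_id}:{report}".encode()).hexdigest()[:16]
    return f"r_{digest}"


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def fetch_report(primary, cache, user_id: str, page: int, page_size: int = 2):
    """Serve one page of a report, filling the cache on the first request."""
    name = result_table(user_id, "trades")
    if not table_exists(cache, name):
        # Cache miss: hit the primary once and materialise the result as a table.
        rows = primary.execute(
            "SELECT symbol, qty FROM trades WHERE user_id = ?", (user_id,)
        ).fetchall()
        cache.execute(f"CREATE TABLE {name} (symbol TEXT, qty INTEGER)")
        cache.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)
    # Sorting and pagination run entirely on the cache database.
    return cache.execute(
        f"SELECT symbol, qty FROM {name} ORDER BY symbol LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()


def clear_cache(cache, user_id: str) -> None:
    """Drop one user's cached result so the next request refills it."""
    cache.execute(f"DROP TABLE IF EXISTS {result_table(user_id, 'trades')}")
```

Because each cached result is an ordinary table, `ORDER BY`, `LIMIT`, and column filters work on it directly, which is the reason the talk gives for using Postgres rather than Redis for this layer.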
  • 00:26:05
    can hello are you able to hear me yeah
  • 00:26:09
    yeah uh you know you're telling that
  • 00:26:10
    about kite platform so uh from the kite
  • 00:26:14
    platform data that is LP enement right
  • 00:26:16
    so from the kite platform data you are
  • 00:26:19
    porting to the console database yeah so
  • 00:26:21
    that is a nightly job yeah that's a
  • 00:26:23
    nightly job that's a nightly job yeah so
  • 00:26:25
    that's what you telling in the console
  • 00:26:28
    uh uh system that create millions of
  • 00:26:30
    data right yeah so um okay maybe I
  • 00:26:33
    should explain this again so uh you
  • 00:26:35
    place your orders your trades and
  • 00:26:37
    everything on kite right at night we
  • 00:26:40
    get an order book or a trade book that
  • 00:26:42
    gets imported into console we do that to
  • 00:26:46
    compute the buy average which is the
  • 00:26:48
    average price at which you bought that
  • 00:26:50
    stock or the profit and loss statement
  • 00:26:53
    which you'll use for taxation for any
  • 00:26:54
    other reason that you might need it for
  • 00:26:56
    that is why we import data into console
  • 00:26:59
    so to fetch these set of statements you
  • 00:27:01
    have to come to console to fetch that
  • 00:27:03
    now when you are fetching the statement
  • 00:27:05
    this caching layer sits on top of
  • 00:27:07
    that your fetches go to this caching
  • 00:27:10
    layer first it checks if there is
  • 00:27:11
    already a prefetched cache for you ready
  • 00:27:13
    or not if not the query goes to the
  • 00:27:16
    DB the data is fetched put into the
  • 00:27:18
    caching layer and for the entire day the
  • 00:27:20
    caching layer is serving the data to you
  • 00:27:22
    not the primary DB the primary DB
  • 00:27:24
    remains free at least for you as the
  • 00:27:26
    user so let's say you come in you play
  • 00:27:28
    around in console you load a bunch of
  • 00:27:29
    reports everything is cached for the
  • 00:27:31
    entire day in this caching layer so
  • 00:27:33
    primary DB remains as it is till the
  • 00:27:36
    night so that night we would
  • 00:27:38
    have gotten all the trades orders any
  • 00:27:40
    things that you have done on your
  • 00:27:41
    trading platform into uh console we
  • 00:27:44
    import all of that we clear our cache
  • 00:27:46
    because it's a fresh set of data your
  • 00:27:48
    pnl your Ledger your financial
  • 00:27:50
    statements have changed because maybe
  • 00:27:51
    you have traded that day maybe you have
  • 00:27:53
    bought stocks that day or anything would
  • 00:27:54
    have happened to your account that
  • 00:27:55
    day so we clear our cache then next day
  • 00:27:59
    when you come and fetch the data again
  • 00:28:01
    all of this is set all over again and
  • 00:28:03
    then you can keep revisiting
  • 00:28:04
    console keep fetching whatever amounts
  • 00:28:06
    of data you want to it will come from
  • 00:28:08
    this cache unless of course you change
  • 00:28:10
    the date parameters only then we
  • 00:28:15
    uh go and fetch the data from our primary
  • 00:28:17
    DB but we have realized that most users
  • 00:28:20
    use the data of a particular time frame
  • 00:28:23
    and they don't they don't want to come
  • 00:28:25
    and check for last 3 years what has
  • 00:28:27
    happened it is always last 6 months last
  • 00:28:28
    2 months last one month and they check
  • 00:28:31
    that once they go back and uh we cannot
  • 00:28:36
    obviously build a system where every
  • 00:28:38
    single date range has to be uh equally
  • 00:28:41
    scalable and equally available uh we are
  • 00:28:44
    very aware that the older which I'll
  • 00:28:46
    talk about how we have sharded we are very
  • 00:28:48
    aware that our older Financial year data
  • 00:28:50
    points don't need to be available all
  • 00:28:52
    the time at the highest possible metrics
  • 00:28:54
    of a server uh they don't have to be at
  • 00:28:57
    a don't have to be served at a very fast
  • 00:28:59
    rate either right so these are the
  • 00:29:01
    decisions that we have taken and it has
  • 00:29:03
    worked fine for us uh might not work for
  • 00:29:05
    another person but yeah so I I hope that
  • 00:29:09
    answered your question one doubt on that
  • 00:29:11
    kite is also having Postgres DB right
  • 00:29:13
    so are you using pg_dump or Postgres
  • 00:29:16
    utilities itself uh no so kite uh uses
  • 00:29:19
    Postgres for its market watch uh so
  • 00:29:22
    market watch is the place where you add
  • 00:29:24
    different scrips or different stocks
  • 00:29:27
    and it tells you the current price of
  • 00:29:28
    the stock uh though we have we have
  • 00:29:31
    plans of moving away from that to S DB
  • 00:29:34
    um that has got nothing to do with this
  • 00:29:37
    uh how I mean I guess you're asking a
  • 00:29:39
    more of a how a broker Works question or
  • 00:29:41
    how a trading platform works but you
  • 00:29:43
    place an order the order comes as an
  • 00:29:46
    exchange file at the end of the day for
  • 00:29:48
    us and we import that so there is no
  • 00:29:50
    pg_dump that happens from kite to console
  • 00:29:53
    so those are completely two different
  • 00:29:54
    silos that have very little to do with
  • 00:29:56
    each other they rarely share data among
  • 00:29:58
    each other and they're never in the hot
  • 00:30:00
    path because we understand that kite is
  • 00:30:02
    an extremely fast read and write platform
  • 00:30:05
    where everything has to happen over
  • 00:30:06
    milliseconds and it can't ever go down
  • 00:30:08
    so these set of fundamentals will not
  • 00:30:10
    really work there so there is no
  • 00:30:12
    connection of dumping kite data via pg_dump into
  • 00:30:14
    console kite Works throughout the day
  • 00:30:16
    console Works after the day after your
  • 00:30:19
    trading Market is done that's where you
  • 00:30:20
    come to console and check how well you
  • 00:30:22
    have performed in the day so I that's
  • 00:30:26
    that one more just caching layer do you
  • 00:30:28
    have in kite also caching layer on kite
  • 00:30:31
    yeah it's redis it's predominantly redis
  • 00:30:34
    caching layer so we also use caching uh
  • 00:30:37
    redis caching on everywhere actually
  • 00:30:39
    it's not just kite we pretty much use
  • 00:30:41
    redis like uh if you have used our
  • 00:30:44
    platform Coin uh it used to sit on a $5
  • 00:30:47
    DigitalOcean droplet for the longest
  • 00:30:49
    possible time because everything was
  • 00:30:50
    cached on a redis uh instance and used
  • 00:30:53
    to work just fine so we use redis
  • 00:30:56
    predominantly to cache we don't use redis
  • 00:30:58
    in console for these kind of caching
  • 00:31:00
    layer because sorting and pagination is
  • 00:31:02
    not supported on it uh this is a very
  • 00:31:05
    specific requirement here it works here
  • 00:31:07
    so we use Postgres here for
  • 00:31:09
    that is it fine yeah that's I use this
  • 00:31:12
    Kite and Console that's why I asked this
  • 00:31:13
    cool no issues um thank
  • 00:31:17
    you so our learnings with Postgres and
  • 00:31:21
    um I'll start off
  • 00:31:23
    with how we because I I remember my my
  • 00:31:27
    summary of my talk uh that is there on
  • 00:31:29
    the posters and Etc outside talks about
  • 00:31:32
    how we Shard and why we Shard the way we
  • 00:31:34
    do it um if you have seen the Citus DB
  • 00:31:38
    extension or a lot of sharding examples
  • 00:31:40
    all over the world of all the DBS in the
  • 00:31:43
    world how they set it up is have
  • 00:31:47
    a master DB have a parent
  • 00:31:51
    DB or whatever and have tenants to every
  • 00:31:54
    single child that is connected to it now
  • 00:31:58
    how those tenants uh work is that you
  • 00:32:00
    query the master DB it figures out that
  • 00:32:02
    these set of tenants are in this uh child
  • 00:32:05
    setup or the sharded setup and the query
  • 00:32:08
    goes there we believe that there is no
  • 00:32:13
    reason to add another column that has
  • 00:32:15
    these IDs on it we actually in most of
  • 00:32:18
    our tables we have deleted all our IDs uh
  • 00:32:20
    extra data don't need it so we follow
  • 00:32:23
    that in a lot of places so um what we
  • 00:32:27
    decided was that we partition
  • 00:32:29
    our database per month because it works
  • 00:32:32
    for
  • 00:32:32
    us then for every single Financial year
  • 00:32:36
    we put it in a different database uh
  • 00:32:38
    server and we connect it via FDW
  • 00:32:42
    and that is our entire sharded
  • 00:32:45
    setup uh has worked fine for us but I
  • 00:32:49
    would I would say
  • 00:32:50
    that at our scale um and our scale
  • 00:32:55
    is 30 40 terabytes or 50 terabytes you can
  • 00:32:58
    say right now
  • 00:33:01
    um it it's starting to falter a bit it's
  • 00:33:04
    not it's not a great experience anymore
  • 00:33:07
    and which is why we are moving to a very
  • 00:33:09
    different setup different way of
  • 00:33:11
    sharding maybe that is for another talk
  • 00:33:13
    but till now we could scale uh to
  • 00:33:16
    millions of users serving billions of
  • 00:33:19
    requests uh 500 600 GB of data per day
  • 00:33:23
    using just foreign data wrapper and a SQL
  • 00:33:25
    jobber caching layer on top of our
  • 00:33:27
    primary DB no nodes no uh load balancer
  • 00:33:31
    nothing at all um so our learnings of
  • 00:33:36
    Postgres um has been that this is
  • 00:33:39
    something that is a there is a gut
  • 00:33:42
    feeling when you write your queries or
  • 00:33:45
    when you write when you look at a
  • 00:33:46
    database schema that uh our gut feeling
  • 00:33:49
    is
  • 00:33:50
    that every query has a time to it like
  • 00:33:55
    for a particular amount of data a
  • 00:33:57
    particular query should not take more
  • 00:33:58
    than x number of milliseconds I guess
  • 00:33:59
    that comes with experience many of you
  • 00:34:01
    can just look at the data look at the
  • 00:34:03
    query and know that something is wrong
  • 00:34:05
    if even if it's slow by a few
  • 00:34:06
    milliseconds you can figure that out so
  • 00:34:08
    we have a hard limit that certain
  • 00:34:10
    queries cannot cross this limit and we
  • 00:34:12
    optimize and keep on optimizing based on
  • 00:34:15
    that um most of our heavy queries are in
  • 00:34:17
    an async setup like the jobber cache as I
  • 00:34:20
    said we ensure that none of it is on the
  • 00:34:23
    hot path of an app um there is no glory
  • 00:34:27
    in storing too much data so we
  • 00:34:30
    delete a lot of data uh so someone was
  • 00:34:32
    surprised that our total database is 50
  • 00:34:35
    terabytes or um yeah probably around 50
  • 00:34:39
    or 60 not more than that for sure um and
  • 00:34:42
    one of the reasons why it is 50 and not
  • 00:34:44
    500 terabytes is we delete a lot of data
  • 00:34:47
    we do not believe in storing data that
  • 00:34:50
    we do not need what what does it mean is
  • 00:34:53
    that we uh for most of the computations
  • 00:34:56
    that we do for most of the Imports and
  • 00:34:58
    inserts and everything that we do we
  • 00:35:00
    have a hot backup or whatever you can
  • 00:35:03
    call it of the last 15
  • 00:35:06
    days after that we have checkpoint
  • 00:35:08
    backups of last one month last two
  • 00:35:10
    months last 3 months one backup for each
  • 00:35:12
    month we do not have any backup in
  • 00:35:15
    between any of those dates because we
  • 00:35:17
    can go back to any single month and
  • 00:35:19
    regenerate every day's data till now we
  • 00:35:22
    are okay doing that because we have that
  • 00:35:25
    a night where uh anything can go wrong
  • 00:35:28
    and we can run these set of computations
  • 00:35:30
    and come back to the current state that
  • 00:35:32
    is right now maybe it doesn't work for
  • 00:35:35
    others but I again this is another
  • 00:35:38
    experience that I've learned looking at
  • 00:35:39
    databases of others that there is a lot
  • 00:35:41
    of frivolous data that people like to
  • 00:35:42
    keep for no reason at all because it
  • 00:35:44
    just makes the database look bigger and
  • 00:35:46
    I don't know makes it look fancier just
  • 00:35:48
    delete it or back it up in
  • 00:35:51
    S3 and put it somewhere it doesn't
  • 00:35:53
    have to be in a database why does
  • 00:35:54
    six-year-old data unless it's a
  • 00:35:56
    compliance that is being set by the
  • 00:35:59
    company you work for or the organization
  • 00:36:01
    you work for unless it's a compliance
  • 00:36:02
    that you have to do it it can be an S3
  • 00:36:05
    backup it can be a file um doesn't have
  • 00:36:08
    to be in a database and you don't have
  • 00:36:09
    to be responsible for every query of
  • 00:36:12
    last 10 years to be served under 1
  • 00:36:15
    millisecond doesn't make sense it will
  • 00:36:17
    never scale don't do that um the other
  • 00:36:22
    thing that I've also noticed is a lot of
  • 00:36:24
    people write maybe this is a front-end
  • 00:36:27
    developer going into backend issue uh
  • 00:36:29
    where a lot of the logic that should
  • 00:36:32
    have been done by Postgres gets done by
  • 00:36:34
    the app and I've noticed that in a lot
  • 00:36:37
    of places and I think that is uh
  • 00:36:39
    something that fundamentally should
  • 00:36:41
    change Postgres in itself can do a lot
  • 00:36:44
    of computations like sum average
  • 00:36:47
    window functions you can do so many
  • 00:36:48
    things by overloading into Postgres rather
  • 00:36:51
    than your app doing it um and I find
  • 00:36:54
    that strange because your app should be
  • 00:36:56
    responsible for just loading the queries
  • 00:36:59
    fetching the data it should not be
  • 00:37:01
    Computing for most of the scenarios I
  • 00:37:03
    think I mean I don't know why this this
  • 00:37:06
    this is something that we had done as a
  • 00:37:08
    mistake too and we learned and I hope
  • 00:37:11
    that uh maybe if there is only one
  • 00:37:14
    learning from my entire talk uh because
  • 00:37:16
    I've noticed this in a lot of places uh
  • 00:37:20
    is to overload your Postgres with most
  • 00:37:22
    of the computations it can do it faster
  • 00:37:24
    than any app that you write unless I
  • 00:37:26
    don't know you're using R or something else
  • 00:37:28
    but still Postgres will be really fast so
  • 00:37:30
    try to do that and um yeah uh as you
  • 00:37:35
    would have noticed that our engineering
  • 00:37:37
    setup is very lean we are it's not
  • 00:37:41
    overwhelming or underwhelming it's stay
  • 00:37:43
    whelmed I guess uh we we don't overdo
  • 00:37:46
    anything at all we we
  • 00:37:49
    always uh we always hit the limits of
  • 00:37:52
    what we have right now in every possible
  • 00:37:54
    way and only then look out for other
  • 00:37:57
    Solutions
  • 00:37:58
    and it has worked pretty good for us we
  • 00:38:00
    have never over engineered any of our
  • 00:38:03
    Solutions till now and we have always
  • 00:38:05
    organically found solutions for whenever
  • 00:38:07
    we have come across issues if Postgres
  • 00:38:09
    hasn't worked for us then that's fine
  • 00:38:12
    we'll find another solution for it so as
  • 00:38:15
    I said sometimes Postgres might not
  • 00:38:17
    be the answer sometimes a different
  • 00:38:18
    database would be the answer for you
  • 00:38:20
    and you should be I guess humble enough
  • 00:38:22
    to accept that and move on from
  • 00:38:24
    Postgres most databases are very
  • 00:38:26
    similar to each other if you go through
  • 00:38:28
    how they design the data how the
  • 00:38:30
    schemas are made unless you're dealing
  • 00:38:32
    with columnar databases they're very
  • 00:38:34
    similar and the
  • 00:38:37
    fundamentals remain the same across all
  • 00:38:40
    databases if they are not then that is a
  • 00:38:42
    wrong database so even Zerodha
  • 00:38:45
    is experimenting with ClickHouse a lot
  • 00:38:47
    and the fundamentals are very similar
  • 00:38:49
    so do not be afraid to experiment with a
  • 00:38:53
    different set of databases we all do
  • 00:38:54
    that a lot in our free time uh because
  • 00:38:57
    because I mean it's a strange way to I
  • 00:38:59
    guess end the talk but Postgres might not
  • 00:39:01
    be an answer for every single problem
  • 00:39:03
    though we found an answer for a lot of
  • 00:39:04
    our problems and you should be okay with
  • 00:39:07
    that uh thank
  • 00:39:09
    [Applause]
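One recommendation from the talk, letting the database do aggregation rather than the application, can be sketched as follows. SQLite from the Python standard library stands in for Postgres here so the example runs anywhere; the `buys` table and its columns are hypothetical, and the buy average is assumed to be the quantity-weighted mean purchase price per symbol.

```python
import sqlite3


def buy_averages(conn: sqlite3.Connection):
    """Compute the quantity-weighted average buy price per symbol in SQL."""
    query = """
        SELECT symbol, SUM(qty * price) / SUM(qty) AS buy_avg
        FROM buys
        GROUP BY symbol
        ORDER BY symbol
    """
    # The application only issues the query and reads the finished result.
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE buys (symbol TEXT, qty INTEGER, price REAL)")
    conn.executemany(
        "INSERT INTO buys VALUES (?, ?, ?)",
        [("INFY", 10, 100.0), ("INFY", 10, 200.0), ("TCS", 5, 50.0)],
    )
    for symbol, avg in buy_averages(conn):
        print(symbol, avg)
```

The alternative, fetching every row and summing in application code, moves far more data over the connection for the same answer.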
  • 00:39:14
    you hello um so even if the application
  • 00:39:18
    users are you can have R inside Postgres so
  • 00:39:20
    that solves the problem anyway
  • 00:39:22
    but my question is uh when you say the
  • 00:39:24
    caching layer has 20 million tables um
  • 00:39:27
    do how do you take care of the catalog
  • 00:39:29
    bloat or do you just drop and recreate
  • 00:39:31
    the whole cluster every night we just
  • 00:39:34
    rm -rf the entire data okay fantastic yeah
  • 00:39:36
    that's what I was thinking the other
  • 00:39:37
    problem is uh even with that um I've had
  • 00:39:41
    scenarios where uh you run into extfs or
  • 00:39:44
    whatever file system related limitations
  • 00:39:46
    because like Postgres stores everything
  • 00:39:49
    in a single directory right so have you
  • 00:39:51
    hit something like that and if so
  • 00:39:52
    what do you do yeah I mean
  • 00:39:55
    U I would I would categorize it as some
  • 00:39:58
    of the mistakes we did at the beginning
  • 00:39:59
    the file limits were wrong to
  • 00:40:01
    begin with but post that we have never
  • 00:40:03
    really come across any file
  • 00:40:05
    limit issues uh we have I mean more than
  • 00:40:09
    happy to admit it we have come across
  • 00:40:10
    issues where we have run out of
  • 00:40:12
    integers for our
  • 00:40:14
    ID because that's the number of rows we
  • 00:40:17
    stored in one single go that also has
  • 00:40:18
    happened so uh and then the import
  • 00:40:21
    stopped then we had to do a ridiculous
  • 00:40:23
    amount of things alter you know how much
  • 00:40:24
    time altering the table would
  • 00:40:26
    have taken but but no we didn't uh it
  • 00:40:29
    was a it was a server configuration
  • 00:40:31
    mistake from our end but it was
  • 00:40:33
    never the issue of Postgres so I haven't
  • 00:40:36
    come across it in my experience okay
  • 00:40:38
    thank
  • 00:40:41
    you so you said you hardly have any
  • 00:40:44
    crashes or any you know downtime so is it
  • 00:40:47
    with some kind of an HA solution or it's
  • 00:40:50
    just you know the instance doesn't crash
  • 00:40:52
    what's the magic oh what's the m i mean
  • 00:40:55
    I think the magic is by the Postgres
  • 00:40:57
    developers no uh I think the reason we
  • 00:41:00
    don't have a lot of crashes is we um we
  • 00:41:05
    have ensured that all our apps are not
  • 00:41:07
    sitting on top of massive databases
  • 00:41:09
    they're always sitting on top of caching
  • 00:41:10
    layers one uh you cannot ever ever ever
  • 00:41:14
    scale an app on top of 10 20 terabytes
  • 00:41:17
    of data and expect it to work without
  • 00:41:18
    crashing it will crash if that happens
  • 00:41:20
    it will overload and we have crashed our
  • 00:41:22
    databases but the mistake was not of
  • 00:41:24
    Postgres that is wrong to expect that
  • 00:41:26
    the mistake was that we thought our app
  • 00:41:28
    can easily query that much data in this
  • 00:41:30
    much amount of time and be fine with it
  • 00:41:33
    it will never work as soon as we made it
  • 00:41:34
    async as soon as we put it uh behind
  • 00:41:37
    our caching layer it worked absolutely
  • 00:41:39
    fine so it's uh again to there's the
  • 00:41:42
    same answer it wasn't the issue of Postgres
  • 00:41:44
    it was our mistake that we had to
  • 00:41:45
    rectify
  • 00:41:51
    thanks okay so we'll take last questions
  • 00:41:55
    after that you go offline questions
  • 00:42:00
    uh this is regarding today's morning
  • 00:42:02
    session right like Kailash was addressing
  • 00:42:05
    that uh before covid you were able to
  • 00:42:07
    take uh 2 million request and during
  • 00:42:11
    covid like you are able to scale up to 8
  • 00:42:14
    million to 12 million uh without scaling
  • 00:42:17
    your system how did that
  • 00:42:19
    happen
  • 00:42:22
    um okay um I'm going to sound a little
  • 00:42:25
    dumb here I guess but caching is a
  • 00:42:27
    magical layer on top of everything I
  • 00:42:29
    guess we were already ready to serve uh
  • 00:42:32
    we did increase we did increase our
  • 00:42:34
    primary DB servers uh the number of
  • 00:42:36
    cores number of parallel workers that
  • 00:42:38
    query the database all of those tuning
  • 00:42:40
    had to change obviously now was it over
  • 00:42:42
    provisioned uh no it was never
  • 00:42:44
    over-provisioned it was always one DB so
  • 00:42:46
    there is no over-provisioning one DB it's
  • 00:42:47
    not like it was a multi-sharded setup so
  • 00:42:49
    it was one DB we added more cores to it the
  • 00:42:52
    the jobber is a separate server that
  • 00:42:54
    runs the the caching server that we call
  • 00:42:56
    it right right so that was never
  • 00:42:58
    over-provisioned that is still whatever
  • 00:43:00
    we started with it's the exact same
  • 00:43:02
    setup till now 16 cores 32 GB RAM still
  • 00:43:04
    now and that's how we started three
  • 00:43:06
    years back uh works fine um I don't know
  • 00:43:10
    man the I guess that's how good the
  • 00:43:12
    caching layer
  • 00:43:13
    is uh you can say that probably we
  • 00:43:17
    over-provisioned before that because
  • 00:43:21
    we by default start with this 16 uh
  • 00:43:23
    cores 32 when we're dealing with a
  • 00:43:25
    Postgres DB because we are used to tuning
  • 00:43:27
    it for that so we know the tuning
  • 00:43:30
    parameters for those set of numbers so
  • 00:43:32
    that's how we start off with that
  • 00:43:33
    usually in that case maybe that's how we
  • 00:43:35
    started here like that we thought that
  • 00:43:36
    it would work fine have you ever
  • 00:43:37
    forecasted that have you ever forecasted
  • 00:43:40
    that load uh sorry I couldn't load load
  • 00:43:42
    load tested uh yeah couple of times uh
  • 00:43:45
    the maximum load that we have gone to uh
  • 00:43:48
    was four or five uh and that's it it's
  • 00:43:51
    never been more than that our Postgres
  • 00:43:53
    database has been overloaded multiple
  • 00:43:54
    times and every single time it has been
  • 00:43:57
    overloaded it has been our mistake where we
  • 00:43:59
    have skipped the caching layer and hit
  • 00:44:01
    the database directly and as I said that
  • 00:44:03
    will never scale it doesn't matter if
  • 00:44:04
    it's one terabyte or 500 GB it it will
  • 00:44:06
    not work so we have every time we
  • 00:44:09
    consciously write a new API endpoint we
  • 00:44:11
    ensure that the first uh thing first
  • 00:44:14
    frontier has to be the caching layer
  • 00:44:16
    sitting on top of it and everything has
  • 00:44:18
    to be async it cannot be concurrent uh
  • 00:44:21
    it cannot be concurrent queries hitting
  • 00:44:22
    the DB and uh an HTTP API endpoint
  • 00:44:26
    waiting for the response to happen uh
  • 00:44:28
    again that will not scale your app will
  • 00:44:29
    go down for sure eventually everything
  • 00:44:32
    will be in an I/O wait situation and
  • 00:44:33
    nothing will work thank you
Tags
  • PostgreSQL
  • Zerodha
  • fintech
  • database
  • indexing
  • sharding
  • manual vacuum
  • materialized views
  • query performance
  • cache layer