ProDeveloperTutorial.com

Tutorials and Programming Solutions

System Design Tutorial Example 4: System design for online file sharing services

prodevelopertutorial February 16, 2019

In this chapter we shall learn the system design for online file sharing services like Dropbox or Google Drive.

Below is the list of features that we shall be discussing:

  1. Upload and download
  2. Sync data
  3. History of file changes

Assumptions on the number of requests that the server will receive:

10 million users

100 million requests/day

High read and write volume
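To get a feel for these numbers, the short calculation below converts the daily figures into an average request rate. It is only a back-of-the-envelope sketch: it assumes traffic is spread evenly across the day, which real traffic is not, so peak load will be higher.

```python
# Back-of-the-envelope load estimate from the assumptions above.
USERS = 10_000_000
REQUESTS_PER_DAY = 100_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# Average rate, assuming (unrealistically) perfectly even traffic.
avg_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
requests_per_user = REQUESTS_PER_DAY / USERS

print(f"average load: {avg_rps:.0f} requests/second")
print(f"per user: {requests_per_user:.0f} requests/day")
```

So the service has to sustain on the order of a thousand requests per second on average, with each user making about ten requests a day.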

Designing a file sharing service is not as simple as uploading a document and then uploading it again whenever it changes. That approach does not scale.

Consider the following scenario:

Suppose the user has a large 20 MB file and the client uploads the whole file after every change. If the user makes five small changes, such as adding a space or fixing a spelling mistake, the client uses 100 MB of bandwidth. That is not efficient.

So instead of uploading the whole file, we can break the file into smaller parts of 2 MB each, giving 10 parts, or chunks. The first time, we upload all the chunks to the cloud. When the user later makes a change, we upload only the chunks that were modified, which saves bandwidth and reduces latency. You can use HDFS for this purpose.
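The chunking idea can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function names are made up for this example, SHA-256 is an assumed choice of hash, and the file is held in memory for simplicity.

```python
import hashlib

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, as in the example above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Return one SHA-256 hex digest per chunk."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data, chunk_size)]


def changed_chunks(old_hashes: list[str], new_hashes: list[str]) -> list[int]:
    """Indexes of chunks in the new version that must be uploaded."""
    return [
        i for i, h in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != h
    ]


original = bytes(20 * 1024 * 1024)       # a 20 MB file of zero bytes
edited = bytearray(original)
edited[5 * 1024 * 1024] = 1              # overwrite one byte inside chunk 2

to_upload = changed_chunks(chunk_hashes(original), chunk_hashes(bytes(edited)))
print(to_upload)  # [2]
```

Only one 2 MB chunk is uploaded instead of the whole 20 MB file. Note that this works because the edit overwrites a byte in place. An edit that inserts bytes shifts every later chunk boundary, so fixed-size chunks after the insertion point would all change; content-defined chunking is one common way to reduce that effect.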

So now let us talk about how to design the service.

[Image: System Design Tutorial Example]

As the above image shows, a client is installed on a mobile device or a laptop. Below are the basic components the client needs to provide file sharing functionality:

4.1. Watchman:

When we first set up a client, we configure a folder for it. The watchman watches this folder. When something in the folder changes, it automatically notifies the Divider and the Indexer.
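A simple way to picture the watchman is as a loop that takes periodic snapshots of the folder and compares them. The sketch below shows only the snapshot and comparison steps; the function names are hypothetical, and a real client would more likely rely on operating system change notifications (for example inotify on Linux) than on polling.

```python
import os
import tempfile


def snapshot(folder: str) -> dict[str, tuple[int, int]]:
    """Map every file under the folder to (modification time in ns, size)."""
    state = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(before: dict, after: dict) -> dict[str, list[str]]:
    """Compare two snapshots and report what the watchman should announce."""
    return {
        "added": sorted(p for p in after if p not in before),
        "removed": sorted(p for p in before if p not in after),
        "modified": sorted(p for p in after if p in before and before[p] != after[p]),
    }


# Demo: watch a temporary folder, create a file, and detect the change.
with tempfile.TemporaryDirectory() as folder:
    before = snapshot(folder)
    with open(os.path.join(folder, "notes.txt"), "w") as f:
        f.write("hello")
    changes = diff_snapshots(before, snapshot(folder))

print(changes["added"])
```

The paths in `added` and `modified` are what the watchman would hand to the divider and the indexer.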

4.2. Divider/Chunker:

Once the divider is notified by the watchman that a file has been added or changed, it divides the data into chunks and uploads them to the cloud. You can use Amazon S3 for storage.
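The divider can be sketched as a generator that reads the file piece by piece, so that large files never have to fit in memory. In the sketch below, `InMemoryStore` is a hypothetical stand-in for an object store such as Amazon S3, and the `memory://` URLs and key layout are invented for illustration only.

```python
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB


def read_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


class InMemoryStore:
    """Hypothetical stand-in for cloud object storage."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> str:
        """Store the data and return the (made-up) URL where it lives."""
        self.objects[key] = data
        return f"memory://bucket/{key}"


def upload_file(name: str, stream: BinaryIO, store: InMemoryStore,
                chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Upload each chunk under its own key and collect the returned URLs."""
    urls = []
    for index, chunk in enumerate(read_chunks(stream, chunk_size)):
        urls.append(store.put(f"{name}/chunk-{index:05d}", chunk))
    return urls


# Demo with a tiny chunk size so the split is easy to see.
store = InMemoryStore()
urls = upload_file("report.txt", io.BytesIO(b"abcdefghij"), store, chunk_size=4)
print(urls)
```

With a 4-byte chunk size, the 10-byte input becomes three chunks: `abcd`, `efgh`, and `ij`.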

4.3. Indexer:

Once the divider has divided the file, a hash is computed for each chunk. When the divider uploads a chunk to the cloud, it gets back the URL where the chunk is stored. The indexer maintains the hash of each chunk along with its URL.
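One possible shape for the indexer is a pair of lookups: chunk hash to URL, and file name to its ordered list of chunk hashes. The class below is a sketch under that assumption; `ChunkIndex` and its method names are hypothetical, and SHA-256 is an assumed hash function.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ChunkIndex:
    # chunk hash -> URL where that chunk is stored
    url_by_hash: dict[str, str] = field(default_factory=dict)
    # file name -> ordered list of chunk hashes that make up the file
    chunks_by_file: dict[str, list[str]] = field(default_factory=dict)

    def record(self, file_name: str, chunk: bytes, url: str) -> str:
        """Remember where a chunk was uploaded and which file it belongs to."""
        digest = hashlib.sha256(chunk).hexdigest()
        self.url_by_hash.setdefault(digest, url)
        self.chunks_by_file.setdefault(file_name, []).append(digest)
        return digest

    def is_known(self, chunk: bytes) -> bool:
        """True if an identical chunk has already been uploaded."""
        return hashlib.sha256(chunk).hexdigest() in self.url_by_hash

    def urls_for(self, file_name: str) -> list[str]:
        """URLs of a file's chunks, in order, for reassembling the file."""
        return [self.url_by_hash[h] for h in self.chunks_by_file.get(file_name, [])]


index = ChunkIndex()
index.record("a.txt", b"hello", "memory://bucket/a.txt/chunk-00000")
index.record("a.txt", b"world", "memory://bucket/a.txt/chunk-00001")

print(index.is_known(b"hello"))   # True
print(index.urls_for("a.txt"))
```

Keying on the hash also lets the client skip uploading a chunk whose content is already stored.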

4.4. DB indexer:

It persists the hash and URL records received from the indexer.

Then we have a messaging service that holds a queue of the changes that have been made. This is needed because there may be multiple clients: if one client changes the data, the change should be replicated to all the other clients. The messaging service broadcasts the update to all connected clients. Once the clients learn that a file has changed, they look into the indexer and compare whether their data is the same or different. If it is different, they update their local copy. You can use RabbitMQ or Kafka for the message queue.
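The broadcast-and-compare flow can be illustrated with an in-process broker built on Python's standard `queue` module. This is only a toy model of what RabbitMQ or Kafka would do across machines: the `Broker` class, the event format, and the choice not to echo a change back to its sender are all assumptions made for this sketch.

```python
import queue


class Broker:
    """Toy message broker: one inbox queue per subscribed client."""

    def __init__(self) -> None:
        self._inboxes: dict[str, queue.Queue] = {}

    def subscribe(self, client_id: str) -> queue.Queue:
        inbox: queue.Queue = queue.Queue()
        self._inboxes[client_id] = inbox
        return inbox

    def publish(self, sender_id: str, event: dict) -> None:
        """Deliver the event to every client except the one that sent it."""
        for client_id, inbox in self._inboxes.items():
            if client_id != sender_id:
                inbox.put(event)


def chunks_to_fetch(local_hashes: list[str], remote_hashes: list[str]) -> list[int]:
    """Positions where the local copy differs from the announced version."""
    return [
        i for i, h in enumerate(remote_hashes)
        if i >= len(local_hashes) or local_hashes[i] != h
    ]


broker = Broker()
laptop_inbox = broker.subscribe("laptop")
phone_inbox = broker.subscribe("phone")

# The laptop changed chunk 1 of a.txt and announces the new chunk hashes.
broker.publish("laptop", {"file": "a.txt", "hashes": ["h0", "h1-new", "h2"]})

event = phone_inbox.get_nowait()
phone_local = ["h0", "h1", "h2"]
print(chunks_to_fetch(phone_local, event["hashes"]))  # [1]
print(laptop_inbox.empty())                           # True
```

The phone learns that only chunk 1 differs and downloads just that chunk.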

 

4.5. Metadata:

We need a metadata store for information such as file history, chunk hashes and their uploaded URLs, and the clients connected to the service. Because many clients will be connected and the data must stay consistent, an RDBMS is a good fit. Many NoSQL stores provide only eventual consistency by default.
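One possible relational layout for this metadata is sketched below using SQLite from Python's standard library. The table and column names are assumptions made for illustration, and the table of connected clients is left out to keep the example short. Each new file version is written inside a single transaction, so other clients never see a half-written version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE file_versions (
    file_id INTEGER NOT NULL REFERENCES files(id),
    version INTEGER NOT NULL,
    PRIMARY KEY (file_id, version)
);
CREATE TABLE version_chunks (
    file_id  INTEGER NOT NULL,
    version  INTEGER NOT NULL,
    position INTEGER NOT NULL,
    hash     TEXT NOT NULL,
    url      TEXT NOT NULL,
    PRIMARY KEY (file_id, version, position),
    FOREIGN KEY (file_id, version) REFERENCES file_versions(file_id, version)
);
""")


def add_version(conn: sqlite3.Connection, path: str,
                chunks: list[tuple[str, str]]) -> int:
    """Store a new version of a file; chunks is an ordered list of (hash, url)."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT OR IGNORE INTO files (path) VALUES (?)", (path,))
        (file_id,) = conn.execute(
            "SELECT id FROM files WHERE path = ?", (path,)).fetchone()
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM file_versions WHERE file_id = ?",
            (file_id,)).fetchone()
        version = latest + 1
        conn.execute(
            "INSERT INTO file_versions (file_id, version) VALUES (?, ?)",
            (file_id, version))
        conn.executemany(
            "INSERT INTO version_chunks VALUES (?, ?, ?, ?, ?)",
            [(file_id, version, pos, h, url) for pos, (h, url) in enumerate(chunks)])
    return version


v1 = add_version(conn, "a.txt", [("h0", "u0"), ("h1", "u1")])
v2 = add_version(conn, "a.txt", [("h0", "u0"), ("h1b", "u1b")])

history = conn.execute(
    "SELECT version, COUNT(*) FROM version_chunks "
    "GROUP BY version ORDER BY version").fetchall()
print(history)  # [(1, 2), (2, 2)]
```

Keeping every version's chunk list gives the file history feature for free: restoring an old version is a matter of reading its rows and fetching the listed URLs.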

This is the basic system design for a file upload service.
