How CYFS Achieves Permanently Available and Quickly Accessible Links
With the spread and development of Web3, more people have come to recognize the problems of Web2: service operators own users' data, service providers hold all the rights, HTTP links are unreliable, and so on. However, although the concept of Web3 is widely discussed, there is no truly decentralized storage facility on the market. The CYFS protocol is the first in the world to truly realize the core concepts of Web3. It secures the decentralized storage of users' data, ensures that data ownership belongs to the users themselves, guarantees users' right to freely publish content that others can access, and keeps both data and data access links permanently available. The vision of CYFS is, as the saying goes, to be carved in stone.
Below, we describe the architecture and innovations of CYFS from a technical perspective to clarify how CYFS achieves the Web3 vision.
The Kernel of Decentralization -- OOD
First of all, every user should have their own Owner Online Device (OOD) to access the CYFS network. The OOD is the key to both our innovation and the decentralization of the CYFS network. On the OOD, users can store their own data and run their own services with their own private keys. Users control their data and services as long as they hold their private keys, which is what makes the decentralization real, just as holding the private key is what makes someone the owner of their BTC.
Various smart devices, including computers, smartphones, smart hardware, and cars, can be connected to an OOD, forming a zone that lets those devices access the Web3 network through your OOD.
Immutable Content-Oriented Link -- CYFS Link
The problem with HTTP links is that the content they point to is editable, and control of the links belongs to the service provider. If the service provider deletes your content or stops operating the service, the content you published can no longer be viewed. Things are different with CYFS links.
Before going on, one concept needs to be clarified. In CYFS, an ID can be generated for a data object with a hash algorithm. Anyone who obtains the object can use this ID to check whether it is the original content. Such an object is called a ‘Named-Object’. In addition, the creator can write a signature over the data into the object, so that anyone who obtains the data can confirm the creator's ownership by verifying the signature with the creator's public key. In this way, data ownership truly belongs to the user.
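As a minimal sketch of the content-ID idea, the example below derives an ID from an object's bytes and uses it to check a copy. The choice of SHA-256, the hex encoding, and the function names `object_id` and `is_original` are assumptions made for illustration; the source does not say which hash or encoding CYFS uses. Signature verification is left out because it needs an asymmetric cryptography library.

```python
import hashlib


def object_id(content: bytes) -> str:
    """Derive an ID from the content itself (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def is_original(content: bytes, expected_id: str) -> bool:
    """Anyone holding the ID can check a copy, wherever it came from."""
    return object_id(content) == expected_id


photo = b"raw bytes of my photo"
photo_id = object_id(photo)

print(is_original(photo, photo_id))                # an untouched copy passes
print(is_original(photo + b"edited", photo_id))    # any change breaks the match
```

Because the ID depends only on the bytes, a copy obtained from any node can be checked the same way, which later sections rely on.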
Now, if you want to share a work of yours with your friend Alice, you can generate a CYFS link for the image and send it to her. The link contains two segments: the first is your Zone ID, and the second is the ID of the data object. As this two-part structure shows, a CYFS link is an immutable, content-oriented link.
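To make the two-segment structure concrete, here is a small parsing sketch. The exact link syntax is not reproduced in this text, so the form `cyfs://<zone-id>/<object-id>` used here is an assumption, as are the names `CyfsLink` and `parse_cyfs_link`; the real syntax may differ.

```python
from typing import NamedTuple


class CyfsLink(NamedTuple):
    zone_id: str    # first segment: whose zone to ask
    object_id: str  # second segment: which content to fetch


def parse_cyfs_link(link: str) -> CyfsLink:
    """Split an assumed cyfs://<zone-id>/<object-id> link into its two parts."""
    prefix = "cyfs://"
    if not link.startswith(prefix):
        raise ValueError("not a cyfs link")
    parts = link[len(prefix):].split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected cyfs://<zone-id>/<object-id>")
    return CyfsLink(zone_id=parts[0], object_id=parts[1])


link = parse_cyfs_link("cyfs://zone-123/obj-abc")
print(link.zone_id, link.object_id)
```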
When Alice obtains the link, she needs to be able to reach your OOD to get the image directly. Traditional P2P networks rely on nodes constantly communicating with each other to maintain information about other nodes. When a node needs to be located, it is common to use flooding, searching node by node, which is both slow and unreliable. IPFS is a typical example.
We therefore made an improvement: our OODs do not maintain additional node information. The CYFS architecture includes a decentralized public chain called Metachain, which takes on a role similar to DNS. The overall framework is shown below.
If you want others to be able to find you, you put your "zone ID: latest zone_info" key-value pair on the Metachain. The zone_info contains information about your zone's configuration and your OOD. When Alice gets the CYFS link of the image, her first step is to ask the Metachain for the latest information about the zone, using the zone ID in the first segment of the link (namely the host), and then connect to your OOD; this is the DNS-like function. Next, she requests the image from your OOD using the object ID in the second segment. Finally, after receiving the image, she verifies against the object ID that it is the original data, saves it on her own OOD, and can then view the picture locally. The entire access process is completely decentralized, without interference from any third party. As long as you maintain your OOD, the link can be accessed quickly. In this way, CYFS ensures everyone's equal right to publish their content and to have it viewed by others.
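The three steps above (look up the zone, fetch from the OOD, verify and store locally) can be sketched with in-memory dictionaries standing in for the Metachain and the OODs. Everything here is hypothetical scaffolding: the `metachain` and `oods` dictionaries, the `publish` and `fetch` helpers, the assumed `cyfs://<zone-id>/<object-id>` link form, and SHA-256 as the ID hash.

```python
import hashlib

metachain = {}  # zone_id -> zone_info (stand-in for the on-chain key-value pair)
oods = {}       # OOD address -> {object_id: content} (stand-in for real devices)


def object_id(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()  # assumed hash choice


def publish(zone_id: str, ood_addr: str, content: bytes) -> str:
    """Store content on the owner's OOD and register the zone on the chain."""
    oid = object_id(content)
    oods.setdefault(ood_addr, {})[oid] = content
    metachain[zone_id] = {"ood": ood_addr}
    return f"cyfs://{zone_id}/{oid}"


def fetch(link: str, local_store: dict) -> bytes:
    zone_id, oid = link[len("cyfs://"):].split("/")
    zone_info = metachain[zone_id]           # step 1: DNS-like lookup
    content = oods[zone_info["ood"]][oid]    # step 2: request from the owner's OOD
    if object_id(content) != oid:            # step 3: verify against the object ID
        raise ValueError("content does not match object ID")
    local_store[oid] = content               # keep a copy on the requester's OOD
    return content


alice_ood = {}
image_link = publish("my-zone", "ood-at-home", b"image bytes")
print(fetch(image_link, alice_ood) == b"image bytes")
```

Note that no third-party server appears in `fetch`: the only inputs are the link, the chain lookup, and the owner's OOD.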
The Reliable Storage of Data
Some people may ask: if I manage my OOD myself, what should I do if something happens to it, for instance if the OOD goes down, the hardware breaks, or data is lost?
This is an important question for reliable data storage. As mentioned above, users are responsible for their own data, but not all data is equally important. Most data on the Internet is not essential. You probably delete chat history regularly rather than uploading everything on your phone to the cloud, while still uploading some important files and pictures. It should therefore be up to users to decide how important their data is and how much they are willing to pay, and data of different importance should have different levels of reliability. This differs from the traditional decentralized storage mindset, which hopes that no data in the network is ever lost or deleted and that all of it is kept at the highest reliability. Depending on the importance of the data, users have several options for preventing data loss:
One solution is for users to buy more OODs, set up a master-slave model, and back up data across multiple OODs.
Back Up on Other Users' OOD
For more important data, multiple local OODs are still not reliable enough. CYFS therefore designed the DSG protocol, which stores encrypted data in the idle space of other users' devices in exchange for a space rental fee paid to the space provider, as shown below:
The basic process of DSG is as follows:
Suppose a user, Bob, wants to back up his data on Dave's OOD. First, Bob and Dave sign a digital agreement that sets the price and duration of storage. Once the contract is in place, Bob encrypts his data and stores it on Dave's OOD: Bob randomly generates a symmetric key K on his own OOD and symmetrically encrypts the data to be backed up, S, to get D. He then uses his own public key to asymmetrically encrypt K to obtain K', attaches K' to D, and stores both on Dave's OOD. K' can only be decrypted with Bob's private key, so Dave cannot steal Bob's data. When the contract begins to execute, Bob does not pay Dave the full storage fee at once, which guards against Dave failing to store the data as agreed after collecting the fee. Instead, a lightning network channel is opened between Bob and Dave, and Bob pays Dave at intervals, much like paying monthly rent. Before each payment, however, Bob launches a proof-of-storage challenge that makes Dave prove he still holds Bob's complete data. To do so, Bob randomly selects an interval (16K is enough) and asks Dave to read the content of that interval from the stored data and compute a result from it with a specific algorithm, which Bob can verify locally. If Bob's local calculation matches the result provided by Dave, Dave is taken to hold the complete data, and Bob pays.
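The proof-of-storage challenge can be illustrated with a short sketch. The source does not specify the verification algorithm, so this example swaps in a simple assumed scheme: Bob sends a random offset and a fresh random nonce, and Dave answers with a SHA-256 hash of the nonce plus the bytes in that interval. The names `make_challenge`, `respond`, and `verify` are hypothetical, "16K" is read as 16 KiB, and Bob is assumed to keep enough of the stored data locally to recompute the answer.

```python
import hashlib
import secrets

INTERVAL = 16 * 1024  # "16K" from the text, assumed to mean 16 KiB


def make_challenge(data_len: int) -> tuple[int, bytes]:
    """Bob picks a random interval start and a nonce Dave cannot predict."""
    start = secrets.randbelow(max(1, data_len - INTERVAL + 1))
    nonce = secrets.token_bytes(16)
    return start, nonce


def respond(stored: bytes, start: int, nonce: bytes) -> str:
    """Dave hashes the nonce with the requested interval of what he stores."""
    return hashlib.sha256(nonce + stored[start:start + INTERVAL]).hexdigest()


def verify(local: bytes, start: int, nonce: bytes, answer: str) -> bool:
    """Bob recomputes the same value from his own copy and compares."""
    return respond(local, start, nonce) == answer


encrypted_backup = secrets.token_bytes(64 * 1024)   # stands in for D
start, nonce = make_challenge(len(encrypted_backup))
answer = respond(encrypted_backup, start, nonce)     # computed on Dave's side
print(verify(encrypted_backup, start, nonce, answer))
```

A single sampled interval only gives probabilistic assurance. Because the interval and nonce change with every challenge, repeated challenges before each payment make it increasingly unlikely that Dave has discarded part of the data.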
When Bob needs to retrieve his data, he also pays gas fees to Dave in batches: each time Dave transmits a piece of the data, Bob pays a portion of the fee, until the transfer is complete.
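The pay-as-you-receive pattern can be sketched as a simple loop. This is only an illustration under assumed rules: the fee is split in proportion to the number of pieces delivered, amounts are integers, and `retrieve_in_batches` is a hypothetical helper, not part of any CYFS API.

```python
def retrieve_in_batches(data: bytes, chunk_size: int, total_fee: int):
    """Return the reassembled data and the list of per-chunk payments."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    received = bytearray()
    payments = []
    paid = 0
    for index, chunk in enumerate(chunks, start=1):
        received += chunk                        # Dave transmits one piece
        due = total_fee * index // len(chunks)   # cumulative fee owed so far
        payments.append(due - paid)              # Bob pays only for what arrived
        paid = due
    return bytes(received), payments


data, payments = retrieve_in_batches(b"x" * 10, chunk_size=4, total_fee=100)
print(payments, sum(payments))
```

Neither side is ever exposed for more than one piece: if Dave stops sending, Bob stops paying, and the reverse.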
With the DSG protocol, users can keep multiple off-site backups. The more backups there are, the lower the chance that the data is deleted or lost. The meta information for these backups is stored on the Metachain for future queries. Even in the worst case, where all of a user's OODs are damaged at the same time, the user can start a new OOD anywhere in the world and, as long as the private key has been kept, obtain the list of backed-up data from the Metachain. With the private key, the user can then pull back the important data backed up around the world, so the data still exists.
Hence, users' data is stored permanently as long as they are willing to pay enough, and the link will never return a 404.
Based on the DSG protocol, a decentralized storage matching market can be built on top of CYFS. The matching market lets storage demanders and storage suppliers choose and trade freely, providing a fair and open market with reasonable prices and additional mechanisms that protect the rights of both sides. At present, there is a DMC public chain on CYFS, which is the first decentralized storage matching market in the CYFS network. The working mode of DMC is shown below.
Large-Scale Distribution (LSD)
In general, the designs above are sufficient, but many scenarios on the Internet require content to be distributed at large scale within a short period of time. We therefore designed the BDT protocol to help application developers efficiently solve the usual content distribution problems in decentralized networks and keep the overall load on the network well balanced.
LSD in Social Networks
Social networks are a typical scenario. For instance, if you publish a picture on your personal page and it goes viral, millions of people may want to access it within a single day. How can so many people quickly reconstruct this picture locally? And how can your OOD handle that many visits when you are only an ordinary user?
Let us analyze this scenario. First, no matter where a requester obtains the image, they can verify that it is the original data by checking it against the object ID. Second, the image spreads through the social network; in other words, the vast majority of those million people see the picture through level-by-level sharing. Each sharer of the image will have a copy of it on their own OOD.
Thus, a requester can try to obtain the picture from nearby nodes that already have it, such as the parent node and sibling nodes in the propagation path, instead of requesting it directly from your OOD. This greatly reduces the access pressure on your OOD.
From this point of view, the more valuable your content is to others, the more people will back it up and serve it to others, which helps keep your links available.
When there are multiple sources, small files can be transferred with two-source synchronous transmission to improve efficiency: one source starts transmitting from the beginning of the file and the other from the end, until the requester has received all the data. A large file can be divided into small pieces, each of which is then transferred with two-source transmission.
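The two-source idea can be simulated in a few lines. In this sketch both sources are plain byte strings holding identical copies, and the two transfers take turns rather than running concurrently; `dual_source_download` and its parameters are hypothetical names, not BDT APIs.

```python
def dual_source_download(source_a: bytes, source_b: bytes, chunk_size: int) -> bytes:
    """Fill a buffer from the front (source A) and the back (source B) until they meet."""
    size = len(source_a)
    buffer = bytearray(size)
    head, tail = 0, size  # bytes in [head, tail) are still missing
    while head < tail:
        # Source A sends the next chunk from the beginning of the gap.
        end = min(head + chunk_size, tail)
        buffer[head:end] = source_a[head:end]
        head = end
        if head < tail:
            # Source B sends the next chunk from the end of the gap.
            begin = max(tail - chunk_size, head)
            buffer[begin:tail] = source_b[begin:tail]
            tail = begin
    return bytes(buffer)


original = bytes(range(256)) * 5
copy = dual_source_download(original, original, chunk_size=100)
print(copy == original)
```

With two real network sources, each side contributes roughly half of the file, so the requester finishes sooner than it would from a single source of the same speed.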
Another way to optimize the load is to work with Internet Service Providers (ISPs) to further improve transmission efficiency and reduce the load on the entire network. Suppose your OOD is in Los Angeles but the OODs of 100 of your fans are in New York. If they all ask you for an image, you need to send the image data from Los Angeles to New York 100 times. However, if the operator's core router in New York can cache the picture data during the first transmission, then when other fans request the picture from you, the router can recognize the BDT requests, intercept them, and send back the cached data directly. In this case, only one of the 100 requests is served from Los Angeles to New York; the remaining 99 transmissions stay within New York City, which can reduce the overall load on the network by 10%. As the share of traffic carried over the BDT protocol grows, this also benefits operators, so they have an incentive to upgrade their routers.
Instant Mass Distribution of Data
We now have a solution for large-scale distribution based on social networks. However, some scenarios demand even more, such as online group chat. If you post an image in a large group of up to 100,000 people, everyone in the group needs to get the image immediately. In this situation, multi-source focused transmission can be used. The transmission strategy needs to be designed for the specific scenario, but the core idea is to divide the file into small pieces and share the work among everyone. Each subset of members fetches a different piece first and forwards that piece to others while still receiving the rest, so that together they ensure the picture reaches everyone quickly.
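As a rough illustration of "divide the file and share the work", the sketch below gives each peer a different first piece from the origin and lets peers fill their remaining gaps from each other. The round-robin assignment and the `distribute` helper are assumptions made for this example; a real strategy would be tuned to the scenario, as the text notes.

```python
def distribute(data: bytes, peers: list[str], piece_size: int):
    """Return each peer's reassembled copy and the number of uploads by the origin."""
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    if not pieces:
        return {peer: b"" for peer in peers}, 0
    have = {peer: {} for peer in peers}
    origin_uploads = 0
    # Round 1: each peer preferentially fetches a different piece from the origin.
    for i, peer in enumerate(peers):
        index = i % len(pieces)
        have[peer][index] = pieces[index]
        origin_uploads += 1
    # Round 2: peers fill the gaps from one another, falling back to the origin.
    for peer in peers:
        for index in range(len(pieces)):
            if index in have[peer]:
                continue
            donor = next((p for p in peers if index in have[p]), None)
            if donor is None:
                have[peer][index] = pieces[index]
                origin_uploads += 1
            else:
                have[peer][index] = have[donor][index]
    copies = {p: b"".join(have[p][i] for i in range(len(pieces))) for p in peers}
    return copies, origin_uploads


members = [f"member-{n}" for n in range(8)]
copies, uploads = distribute(b"0123456789", members, piece_size=3)
print(all(c == b"0123456789" for c in copies.values()), uploads)
```

In this run the origin uploads one piece per member instead of the whole file per member; the rest of the traffic is carried by the group itself.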
The design described above relies on everyone actively contributing to distribution in the network. If a publisher wants their content to spread faster and is willing to pay more, they can pay to reward nodes in the network for providing additional content distribution services. When a node starts providing additional data uploads for financial gain, it becomes a CacheNode in the BDT network. The BDT protocol includes a proof of transmission to protect the interests of both parties.
cyfs:// guarantees people's fundamental rights to save and publish content on the Internet. These rights are essential and will reshape the basic logic of many Internet behaviors. In addition, with data rights confirmed, the era of content creators is coming soon, which means Internet users will be upgraded from mere users to so-called Internet citizens.