Recently, I did a private deployment of our company's SaaS product at a customer site. The customer's network has no internet access, so data could only be brought in through a Linux jump server. Besides the offline k8s deployment package and images, there were several hundred GB of pre-cut video data, which could only be transferred via Baidu Netdisk. I wondered whether Baidu Netdisk data could be synchronized from the command line, and a quick Google search showed that it can. Below is a brief introduction to using bypy.

Introduction to bypy

bypy is a Python client for Baidu Cloud/Baidu Netdisk, mainly used to operate Baidu Netdisk from Linux. It can list, download, upload, and compare files, and sync in both directions (syncup and syncdown). Key features include Unicode/Chinese support, retry on failure, recursive upload/download, directory comparison, and hash caching.

Due to Baidu PCS API permission restrictions, the program can only access files and directories under the /apps/bypy directory in Baidu Cloud.
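
To make the path restriction concrete, here is a minimal sketch of how a remote path given to bypy relates to the path you see in the Netdisk web UI. The helper name to_netdisk_path is hypothetical and not part of bypy; it only illustrates the /apps/bypy prefix described above.

```python
import posixpath

# The only Netdisk directory bypy can access, as stated in the text above.
APP_ROOT = "/apps/bypy"


def to_netdisk_path(remote_path):
    """Map a bypy remote path to the absolute path shown in Baidu Netdisk.

    Hypothetical helper for illustration; bypy does this mapping internally.
    """
    # Treat "/k8s" and "k8s" the same: both are relative to the app root.
    relative = remote_path.lstrip("/")
    full = posixpath.normpath(posixpath.join(APP_ROOT, relative))
    # Reject paths such as "../other" that would leave the app directory.
    if full != APP_ROOT and not full.startswith(APP_ROOT + "/"):
        raise ValueError("path escapes %s: %r" % (APP_ROOT, remote_path))
    return full


print(to_netdisk_path("k8s"))
print(to_netdisk_path("/"))
```

In practice this means files must be placed under /apps/bypy in Netdisk before bypy can see them.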

Installing bypy

Install pip

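# Download the get-pip.py that matches your Python version. Run only one of the
# two curl commands: both save to get-pip.py, so the second would overwrite the first.
curl -O https://bootstrap.pypa.io/pip/2.7/get-pip.py # For Python 2.7
curl -O https://bootstrap.pypa.io/pip/3.6/get-pip.py # For Python 3.6
python get-pip.py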
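
If you script the installation, picking the right get-pip.py can be automated. The sketch below is only an illustration: get_pip_url is a hypothetical helper, and it knows only the two URLs listed in this article. For any other Python version it raises an error rather than guessing a URL.

```python
import sys

# Only the two versioned bootstrap URLs mentioned in this article.
GET_PIP_URLS = {
    (2, 7): "https://bootstrap.pypa.io/pip/2.7/get-pip.py",
    (3, 6): "https://bootstrap.pypa.io/pip/3.6/get-pip.py",
}


def get_pip_url(version=None):
    """Return the get-pip.py URL for a (major, minor, ...) version tuple.

    Defaults to the running interpreter. Hypothetical helper for illustration.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    try:
        return GET_PIP_URLS[(major, minor)]
    except KeyError:
        # Deliberately no fallback: check bootstrap.pypa.io for other versions.
        raise LookupError("no get-pip.py URL known for Python %d.%d" % (major, minor))


print(get_pip_url((2, 7)))
print(get_pip_url((3, 6)))
```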

Install bypy

pip install bypy
python -m bypy info # The first run prints an authorization URL; open it in a browser, authorize the app, and obtain the authorization code

Copy and paste the authorization code into the terminal, press Enter, and you’re done.

Detailed bypy Commands

$ bypy help <command> # View help for a specific command
$ bypy info # View Baidu Netdisk quota and usage
Quota: 13.294TB
Used: 1.917TB
$ bypy list # List the contents of /apps/bypy (shown under "My Application Data" in Baidu Netdisk)
/apps/bypy ($t $f $s $m $d):
D course 0 2023-12-04, 15:21:41
D k8s 0 2023-12-04, 15:47:47
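
Before uploading a large data set it can be handy to check the remaining quota from a script. The following is a rough sketch that parses the two lines of `bypy info` output shown above. It assumes the output keeps this "Quota:/Used:" shape and that the units are 1024-based, neither of which I have verified against the bypy source; parse_size and parse_info are hypothetical names.

```python
import re

# Assumption: sizes are 1024-based.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4, "PB": 1024**5}


def parse_size(text):
    """Convert a string such as '13.294TB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGTP]?B)\s*", text)
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    return float(match.group(1)) * UNITS[match.group(2)]


def parse_info(output):
    """Extract the 'Quota' and 'Used' values (in bytes) from `bypy info` output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Quota", "Used"):
            fields[key.strip().lower()] = parse_size(value)
    return fields


# Sample output copied from the listing above.
sample = "Quota: 13.294TB\nUsed: 1.917TB\n"
info = parse_info(sample)
free_tb = (info["quota"] - info["used"]) / UNITS["TB"]
print("Free: %.3fTB" % free_tb)
```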

$ bypy downdir k8s # Download the k8s directory under /apps/bypy
$ bypy syncup # Sync the current directory to the cloud
$ bypy upload # Also uploads the current directory to the cloud
$ bypy syncdown # Sync the cloud content to the current directory
$ bypy downdir / # Also downloads the cloud content to the current directory

$ bypy compare # Compare the current local directory with the app's root directory in the cloud (/apps/bypy)
$ bypy -v # Add -v when running a command to show progress details
$ bypy -d # Add -d when running a command to show some debug information
$ bypy -ddd # Add -ddd to also show HTTP communication details
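
When several of these commands have to run unattended on a jump server, it can help to assemble them in a small script. The sketch below only builds and prints the command lines, using the subcommands and the -v/-d flags listed above. The helper names bypy_command, render, and run are hypothetical, and placing the flags before the subcommand is my assumption based on how the options are shown above.

```python
import shlex
import subprocess


def bypy_command(action, *args, verbose=False, debug=0):
    """Build an argv list such as ['bypy', '-v', 'downdir', 'k8s']."""
    cmd = ["bypy"]
    if verbose:
        cmd.append("-v")  # progress details
    if debug:
        cmd.append("-" + "d" * debug)  # debug=1 gives -d, debug=3 gives -ddd
    cmd.append(action)
    cmd.extend(args)
    return cmd


def render(cmd):
    """Render an argv list as a shell-safe string for logging or a dry run."""
    return " ".join(shlex.quote(part) for part in cmd)


def run(cmd):
    """Execute the command and return its exit code (not called in this sketch)."""
    return subprocess.call(cmd)


# Dry run: print the planned commands instead of executing them.
plan = [
    bypy_command("list"),
    bypy_command("downdir", "k8s", verbose=True),
    bypy_command("compare", debug=1),
]
for cmd in plan:
    print(render(cmd))
```

Replacing print(render(cmd)) with run(cmd) would execute the plan. It is worth doing the dry run first, since a download of several hundred GB is slow to repeat.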

Summary

That wraps up my look at bypy. While researching, I found many tools based on baidupcs, but most of them are no longer maintained. I hope bypy continues to be supported.

Reference: https://github.com/houtianze/bypy