basic
awk
awk -v FS='input_sep' -v OFS='output_sep' '{if ($1 == $5) print $1, $5, $10}' filename
In filename (whose columns are separated by input_sep), find every line whose first and fifth fields are equal, and print fields 1, 5, and 10 joined by output_sep. If FS and OFS are not set, both default to whitespace.
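A quick self-contained illustration of the FS/OFS behavior, using comma-separated input and a pipe as the output separator:

```shell
# Two sample rows; only the first has $1 == $5
printf 'a,b,c,d,a,f,g,h,i,j\nx,b,c,d,y,f,g,h,i,j\n' \
  | awk -v FS=',' -v OFS='|' '{if ($1 == $5) print $1, $5, $10}'
# → a|a|j
```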
Exclude a column with awk, e.g. print every column except the fifth:
awk '{ $5=""; print }' file
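Note that assigning to $5 makes awk rebuild the line with OFS, so the emptied field leaves a doubled separator behind:

```shell
echo '1 2 3 4 5 6' | awk '{ $5=""; print }'
# → 1 2 3 4  6   (two spaces where column 5 was)
```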
base64
Decode
echo [base64-encoded-string] | base64 --decode
Encode
echo "your string" | base64
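A round trip tying the two together (printf avoids encoding the trailing newline that echo would add):

```shell
enc=$(printf 'hello' | base64)          # → aGVsbG8=
printf '%s' "$enc" | base64 --decode    # → hello
```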
chmod
Make a file directly executable: chmod +x filename
Give all users read and write permission on a directory: sudo chmod ugo+rw /opt
Octal digits: r=4, w=2, x=1
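The octal digits add up per user class; for example, 754 grants the owner rwx (4+2+1), the group r-x (4+1), and others r (4):

```shell
touch demo.sh
chmod 754 demo.sh
ls -l demo.sh   # → -rwxr-xr-- ...
```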
copy
1 | rsync -aP <sourceDir> <targetDir> |
cut
echo "hello_world" | cut -c1-5   # Output: hello
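-c selects by character position; for delimited data, -d and -f select by field instead:

```shell
echo 'a:b:c' | cut -d':' -f2   # → b
```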
rclone
Built-in retry and performance options
By default, rclone copy only transfers files and does not create empty directories at the destination. To copy all directories (including empty ones), add the --create-empty-src-dirs flag.
rclone copy dir1 dir2
Note that this transfers the contents of dir1 into dir2, not the dir1 directory itself into dir2.
parallel
find /path/to/source/ -maxdepth 1 -type d | parallel -j 8 'rclone copy {} /path/to/target/{/} --transfers=32 --checkers=32 --fast-list --buffer-size=512M --max-backlog=100000 --use-mmap --no-traverse --stats=30s --log-file=rclone_{/}.log'
find Part
-maxdepth 1 -type d: finds the top-level directories under the source (excludes files at this level and the source directory itself).
parallel Part
-j 8: runs 8 parallel rclone jobs, one per subdirectory.
{}: placeholder for each directory from find.
{/}: basename of the directory (e.g., subdir1 from the full path).
rclone copy Part
--transfers=32: 32 concurrent file transfers per rclone job.
--checkers=32: 32 concurrent file checks (e.g., verifying existence or integrity).
--fast-list: Uses a single API call to list files (faster for remote filesystems like CephFS).
--buffer-size=512M: 512 MB buffer per transfer (reduces disk I/O waits).
--max-backlog=100000: Allows up to 100,000 files to queue before processing.
--use-mmap: Uses memory-mapped I/O for reads (can speed up large files).
--no-traverse: Skips directory traversal for remote sources, relying on listings.
--stats=30s: Shows progress every 30 seconds.
--log-file=rclone_{/}.log: Logs each job to a file named after the subdirectory (e.g., rclone_subdir1.log).
delete
1. To delete all files in a directory except filename (the !() patterns below require bash with shopt -s extglob enabled), type the command below:
rm -v !("filename")
2. To delete all files with the exception of filename1 and filename2:
rm -v !("filename1"|"filename2")
3. The example below shows how to remove all files other than all .zip
files interactively:
rm -i !(*.zip)
4. Next, you can delete all files in a directory apart from all .zip
and .odt
files as follows, while displaying what is being done:
rm -v !(*.zip|*.odt)
5. To delete subdirectories of a given age under a directory, combine find and rm:
find /path/to/directory -type d -mtime +365 -exec rm -rf {} \;
6. To delete files older than one week under a directory:
find /path/to/directory -type f -mtime +7 -exec rm {} \;
or
find /path/to/directory -type f -mtime +7 -delete
A name pattern can also be given:
find /var/log -name "*.log" -type f -mtime +30
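-mtime +N matches files last modified more than N days ago. A self-contained demo (GNU touch -d fakes the timestamp); listing first before adding -delete is a safe habit:

```shell
mkdir -p /tmp/agedemo && cd /tmp/agedemo
touch -d '40 days ago' old.log   # GNU coreutils
touch new.log
find . -name '*.log' -mtime +30   # → ./old.log
```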
du
du -h --max-depth=1 --exclude='proc' --exclude='home' --exclude='mnt'
dd
dd if=<input> of=<output> [options]
• if: Specifies the input file or device (e.g., /dev/sda or /path/to/file).
• of: Specifies the output file or device (e.g., /dev/sdb or /path/to/file).
• bs=SIZE: Block size, defining how much data to read and write at a time. It can be set to values like 512, 4M, etc. Example: bs=1M reads and writes 1 megabyte at a time.
• count=N: Limits the number of blocks copied. For example, count=100 copies 100 blocks of the size specified by bs.
• status=progress: Shows real-time progress while copying.
• conv=notrunc: Prevents truncation of the output file (useful when appending).
• conv=sync: Pads the input to the full block size with null bytes if necessary.
Copy a specific file to a specific path:
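The command for this entry was not preserved; a minimal sketch of copying one file with dd (cp would normally do the job):

```shell
printf 'hello dd\n' > /tmp/src.txt
dd if=/tmp/src.txt of=/tmp/dst.txt status=none
cat /tmp/dst.txt   # → hello dd
```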
diff
Suppress the display of identical content
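The exact command was not preserved; one way to show only the differing lines is a side-by-side diff with common lines suppressed:

```shell
printf 'a\nb\nc\n' > /tmp/f1
printf 'a\nX\nc\n' > /tmp/f2
diff -y --suppress-common-lines /tmp/f1 /tmp/f2   # shows only the "b | X" line
```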
find
- Find a specific file
find / -name filename
- Count all files under a directory
find /path/to/directory -type f | wc -l
- Find a directory
find </your/start/path> -type d -name <name> -exec dirname {} \;
- Find files owned by a given user
find ./* -user username
- Find files owned by a given group
find ./* -group groupname
Match everything except a particular file type and delete the matches:
find . ! -name "*.txt" -delete
Negate multiple patterns:
find . ! \( -name "log4j*" -o -name "flink*" \)
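A self-contained check of the multi-pattern negation (-delete omitted so it only lists):

```shell
mkdir -p /tmp/negdemo && cd /tmp/negdemo
touch log4j.jar flink.jar app.jar
find . -type f ! \( -name 'log4j*' -o -name 'flink*' \)   # → ./app.jar
```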
Filter by modification age in days:
Find files older than 30 days
find <directory> -type f -name "*.tar.gz" -mtime +30
Find files modified within the last 30 days
find <directory> -type f -name "*.tar.gz" -mtime -30
findmnt
List all mounted filesystems
findmnt
grep
- Search file contents recursively under a directory
grep -rn "info" *
- Search within a large file
# Chaining greps through a pipe keeps only lines that match both a timestamp and an IP (logical AND).
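A sketch of the chained-grep AND filter described above (the sample log lines are made up):

```shell
printf '2024-01-01 10.0.0.5 ok\n2024-01-01 10.0.0.6 ok\n2024-01-02 10.0.0.5 ok\n' \
  | grep '2024-01-01' | grep '10.0.0.5'
# → 2024-01-01 10.0.0.5 ok
```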
- logical
Match any of multiple strings (logical OR)
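The command for the OR case was not preserved; with extended regexes it is typically grep -E with alternation (grep -e PAT1 -e PAT2 works too):

```shell
printf 'apple\nbanana\ncherry\n' | grep -E 'apple|cherry'
# → apple
#   cherry
```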
View the lines surrounding a match
-A n   # show n lines after the match
-B n   # show n lines before the match
-C n   # show n lines around the match
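For example, with one line of context on each side:

```shell
printf '1\n2\nmatch\n4\n5\n' | grep -B1 -A1 'match'
# → 2
#   match
#   4
```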
ls
ls -lh
Show file sizes in human-readable units (K, M, G)
link
Create a symbolic link
ln -s <source_file_or_dir> <link_file_or_dir>
Find the symlinks under a directory
ls -alR | grep ^l
relink
ln -sf <new_target> <link_name>
-s: create a symbolic link.
-f: force overwriting an existing link.
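A demo of creating a link and then retargeting it in place with -sf:

```shell
mkdir -p /tmp/linkdemo && cd /tmp/linkdemo
echo v1 > target1 && echo v2 > target2
ln -s target1 mylink
cat mylink             # → v1
ln -sf target2 mylink  # retarget the existing link
cat mylink             # → v2
```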
nbsp
Displaying non-breaking spaces
Character | Normal Space | Non-Breaking Space |
---|---|---|
Unicode codepoint | U+0020 | U+00A0 |
UTF-8 representation | 0x20 | 0xC2A0 |
Visible? | No (looks like a space) | No (looks like a space) |
Breaks line? | ✅ Yes | ❌ No |
YAML-safe? | ✅ Yes | ❌ No — may cause parse errors |
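The display command itself was not preserved; since a non-breaking space is the byte pair 0xC2 0xA0 in UTF-8 (see the table above), one way is to grep for those bytes directly (the nbsp is injected here with printf):

```shell
printf 'a\xc2\xa0b\n' | grep -c "$(printf '\xc2\xa0')"   # → 1 (one line contains an nbsp)
```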
sed
Replace characters
Linux (GNU sed):
sed -i 's/Search_String/Replacement_String/g' Input_File
macOS (BSD sed; set a backup suffix to guard against corrupting the file):
sed -i .bak 's/Search_String/Replacement_String/g' Input_File
Delete a range of lines
sed -i '1,5d' example.txt
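For example, deleting lines 1-5 leaves only line 6 (GNU sed; on macOS use a backup suffix with -i):

```shell
printf 'l1\nl2\nl3\nl4\nl5\nl6\n' > /tmp/example.txt
sed -i '1,5d' /tmp/example.txt
cat /tmp/example.txt   # → l6
```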
sort
sort --parallel=8 -S 4G -T /data -k2,3 largefile.txt > sorted_file.txt
This sorts with 8 parallel threads and at most a 4 GB in-memory buffer. The -T /data option stores sort's temporary files under /data instead of the default location. -k2,3 sorts on fields 2 through 3: by field 2 first, then field 3 to break ties.
tar
c - create a new .tar archive file
x - extract (untar) an archive
v - verbosely show progress
f - the archive file name follows
z - gzip compression
j - bzip2 compression
t - list the contents of an archive
Extract files to a specific destination directory:
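The command for this entry was not preserved; the destination is chosen with -C:

```shell
mkdir -p /tmp/tardemo/src /tmp/tardemo/dest
echo hi > /tmp/tardemo/src/a.txt
tar -czf /tmp/tardemo/a.tar.gz -C /tmp/tardemo/src a.txt
tar -xzf /tmp/tardemo/a.tar.gz -C /tmp/tardemo/dest   # -C: extract into dest
cat /tmp/tardemo/dest/a.txt   # → hi
```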
tr
tr - translate or delete characters
Case conversion
cat file | tr A-Z a-z
cat file | tr a-z A-Z
wc
- Syntax
wc [options] file...
- Count all files under a directory
find /path/to/directory -type f | wc -l
wget
- Download a directory recursively
1 | wget -r --no-parent http://abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/ |
scenario
Filesystem
Check the filesystem type
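The command itself was not preserved; common ways to check a filesystem's type include:

```shell
df -T /                  # GNU df: prints a Type column
findmnt -n -o FSTYPE /   # util-linux
```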
Create a file of a given size
dd
dd if=/dev/zero of=/path/to/directory/filename bs=block_size count=number_of_blocks
if=/dev/zero: input file (/dev/zero provides null bytes; use /dev/urandom for random data).
of=/path/to/directory/filename: output file path and name.
bs: block size (e.g., 1K, 1M, 1G).
count: number of blocks to write.
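For example, an 8 MiB zero-filled file (bs=1M, count=8):

```shell
dd if=/dev/zero of=/tmp/blank.bin bs=1M count=8 status=none
stat -c %s /tmp/blank.bin   # → 8388608
```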
File splitting
split [-a] [-b] [-C] [-l] [file_to_split] [prefix_for_output_files]
Merge the split files back together
cat files_name_1 files_name_2 files_name_3 > files_name
Split by line count
split -l 10000 bigfile.txt smallfile
The split pieces remain readable as ordinary files.
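A round trip: split a 25,000-line file into 10,000-line chunks, then concatenate the pieces back and compare:

```shell
seq 1 25000 > /tmp/bigfile.txt
split -l 10000 /tmp/bigfile.txt /tmp/smallfile
ls /tmp/smallfile*                       # → /tmp/smallfileaa ...ab ...ac
cat /tmp/smallfile* > /tmp/rejoined.txt
cmp /tmp/bigfile.txt /tmp/rejoined.txt && echo identical
```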
Count the bytes in a file. Note that wc -c counts bytes, so a multi-byte character (e.g., Chinese) counts as several.
wc -c /path/to/file
Building on that, report the size in KB
wc -c /path/to/file | awk '{print $1/1024}'
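A quick check with a file of exactly 2048 bytes:

```shell
printf '%2048s' ' ' > /tmp/twokb          # 2048 space characters
wc -c < /tmp/twokb                        # → 2048
wc -c /tmp/twokb | awk '{print $1/1024}'  # → 2
```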
Sort a file on multiple columns with awk and sort
awk '{print $0}' head_100.csv | sort -t ',' -k2,3 > head_100_sort.csv
(The awk stage here is a pass-through; sort -t ',' -k2,3 head_100.csv alone is equivalent.) The -t option sets the field separator (a comma here), and -k2,3 sorts on fields 2 through 3: by field 2 first, then field 3 to break ties.
Count the length of a string
echo string | wc -m
(echo appends a newline, which is counted; use printf to avoid it.)
Batch-rename files
rename -n -e 's/old_string/new_string/' *.png
(Perl rename: -n previews the changes without applying them; drop it to actually rename.)
Convert a file's encoding
Check the encoding
In vim:
:set fileencoding
With file:
file -I filename   # macOS; use file -i on Linux
Convert the encoding
Then use iconv to convert, e.g. from UTF-8 to GBK:
iconv -f UTF-8 -t GBK input.file -o output.file
If you hit "iconv: illegal input sequence at position ...", one workaround is the -c option, which discards invalid characters:
iconv -c -f gb2312 -t utf8 test.txt -o output.file
iconv -f gb18030 -t UTF-8 input.file -o output.file
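A verifiable round trip (using Latin-1 instead of GBK so the byte counts are easy to check; "héllo" is 7 bytes in UTF-8 but 6 in Latin-1):

```shell
printf 'h\xc3\xa9llo\n' > /tmp/utf8.txt             # "héllo" in UTF-8 (7 bytes)
iconv -f UTF-8 -t LATIN1 /tmp/utf8.txt -o /tmp/latin1.txt
wc -c < /tmp/latin1.txt                             # → 6
iconv -f LATIN1 -t UTF-8 /tmp/latin1.txt | cmp - /tmp/utf8.txt && echo round-trip ok
```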
Format JSON
echo '{"kind": "Service", "apiVersion": "v1", "status": {"loadBalancer": true}}' | jq .
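jq can also extract individual fields; -r prints the raw string without quotes:

```shell
echo '{"kind": "Service", "apiVersion": "v1"}' | jq -r .kind   # → Service
```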
Encryption
Compress a file into a password-protected zip (-e prompts for a password, -r recurses into directories); unzip asks for the password when extracting
zip -re filename.zip filename
Convert markdown to Word
pandoc -o output.docx -f markdown -t docx filename.md
troubleshooting
Reported disk usage does not match the actual files
Check for deleted files still held open by a process:
lsof | grep deleted
After copying, spaces are encoded inconsistently
Replace non-breaking spaces (nbsp) with normal spaces
sed -i 's/\xC2\xA0/ /g' <file>
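A quick check that the substitution works (GNU sed understands \xHH escapes; the nbsp is injected with printf):

```shell
printf 'a\xc2\xa0b\n' > /tmp/nbsp.txt
sed -i 's/\xC2\xA0/ /g' /tmp/nbsp.txt
cat /tmp/nbsp.txt   # → a b   (now a normal space)
```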
Clean up data left behind at a mount point's old location
For example, data used to live at /mnt/disk0/data on the system disk (device vda); later a new disk (vdb) was mounted at /mnt/disk0. The old /mnt/disk0/data still occupies space on the system disk, hidden underneath the mount. Start by creating a second mount point from which the shadowed path can be reached:
sudo mkdir /mnt/old_disk0
tree command error
error
. [error opening dir]
locate
dmesg
print
[5309728.436569] audit: type=1400 audit(1754363848.580:151): apparmor="DENIED" operation="open" profile="snap.tree.tree" name="/mnt/dingofs/dingospeed/data/repos/repos/files/models/" pid=945507 comm="tree" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
resolve
snap remove tree
apt install tree
Refresh Bash's command cache for the current shell
hash -r