NooBaa

Terminology

NSFS

NSFS (short for Namespace-Filesystem) is a capability to use a shared filesystem (mounted in the endpoints) for the storage of S3 buckets, while keeping a 1-1 mapping between Object and File.
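The 1-1 mapping can be pictured like this (a toy illustration only; the mount path and bucket name are hypothetical, not NooBaa defaults):

```shell
# Illustrative only: how an S3 key lines up with a file on the shared FS under NSFS
bucket_root="/nsfs/dingofs-bucket-1"   # hypothetical mount path of the bucket directory
key="logs/2025/app.log"                # object key
echo "s3://dingofs-bucket-1/${key} <-> ${bucket_root}/${key}"
# -> s3://dingofs-bucket-1/logs/2025/app.log <-> /nsfs/dingofs-bucket-1/logs/2025/app.log
```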

reference

NSFS on Kubernetes

  • Supported FS backend types are GPFS, CEPH_FS, and NFSv4; the default is POSIX
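A hedged sketch of selecting a non-default backend when creating an NSFS namespacestore (flag name and values per the noobaa CLI help; verify with `noobaa namespacestore create nsfs --help`; the store and PVC names are illustrative):

```shell
# Sketch: create an NSFS namespacestore on a CephFS-backed PVC (names illustrative)
noobaa namespacestore create nsfs cephfs-store \
  --pvc-name='nsfs-cephfs-pvc' \
  --fs-backend='CEPH_FS' \
  -n noobaa
```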


make noobaa  # runs the three builds below: noobaa-builder, noobaa-base, noobaa-core

# the above performs the following:
# 1. build the noobaa-builder image
docker build --build-arg CENTOS_VER=9 --build-arg BUILD_S3SELECT=1 --build-arg BUILD_S3SELECT_PARQUET=0 -f src/deploy/NVA_build/builder.Dockerfile -t noobaa-builder .

# 2. build the noobaa-base image
docker build --build-arg BUILD_S3SELECT=1 --build-arg BUILD_S3SELECT_PARQUET=0 -f src/deploy/NVA_build/Base.Dockerfile -t noobaa-base .

# 3. build the noobaa-core image
docker build --build-arg CENTOS_VER=9 --build-arg BUILD_S3SELECT=1 --build-arg BUILD_S3SELECT_PARQUET=0 -f src/deploy/NVA_build/NooBaa.Dockerfile -t noobaa --build-arg GIT_COMMIT="d6feb0a" .
  • If the code was modified, rebuild and push:

    make base    # builds noobaa-base:dingofs
    make noobaa  # builds noobaa-core:dingofs-v1.0
    docker tag noobaa-core:dingofs-v1.0 harbor.zetyun.cn/dingofs/noobaa-core:dingofs-v1.0.x
    docker push harbor.zetyun.cn/dingofs/noobaa-core:dingofs-v1.0.x

quay artifacts

docker pull quay.io/noobaa/noobaa-builder:master-20250623
docker pull quay.io/noobaa/noobaa-base:master-20250623

workflow

  • build image

    .github/workflows/manual-full-build.yaml

Deploy

curl -LO https://github.com/noobaa/noobaa-operator/releases/download/v5.18.4/noobaa-operator-v5.18.4-linux-amd64.tar.gz
tar -xzvf noobaa-operator-v5.18.4-linux-amd64.tar.gz
# $OS and $VERSION must match the extracted binary name, e.g. OS=linux-amd64 VERSION=v5.18.4
chmod +x noobaa-$OS-$VERSION
mv noobaa-$OS-$VERSION /usr/local/bin/noobaa

install

  • default sc

    default-sc-dingofs-noobaa.yaml

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: default-sc-dingofs-noobaa
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: csi.dingofs.com
    allowVolumeExpansion: true
    reclaimPolicy: Retain
    parameters:
      csi.storage.k8s.io/provisioner-secret-name: dingofs-secret-noobaa
      csi.storage.k8s.io/provisioner-secret-namespace: dingofs
      csi.storage.k8s.io/node-publish-secret-name: dingofs-secret-noobaa
      csi.storage.k8s.io/node-publish-secret-namespace: dingofs
      pathPattern: "${.pvc.namespace}-${.pvc.name}"
    mountOptions:
      - diskCache.diskCacheType=2
      - block_cache.cache_store=disk
      - disk_cache.cache_dir=/dingofs/client/data/cache/0:10240 # "/data1:100;/data2:200"
      - disk_cache.cache_size_mb=102400 # MB
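With this pathPattern, each volume lands in a per-PVC directory. A toy expansion of the pattern (the real substitution is done by the CSI provisioner; the names mirror the PVC defined later in this document):

```shell
# Toy expansion of pathPattern "${.pvc.namespace}-${.pvc.name}"
pvc_namespace="noobaa"
pvc_name="nsfs-dingofs-pvc"
subdir="${pvc_namespace}-${pvc_name}"
echo "$subdir"
# -> noobaa-nsfs-dingofs-pvc
```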
  • prepare

    sudo ctr -n k8s.io images pull dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-core:master-20250623
    sudo ctr -n k8s.io images pull dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-operator:5.18.4
    sudo ctr -n k8s.io images pull quay.io/sclorg/postgresql-15-c9s:latest
  • install

    kubectl create ns noobaa
    kubectl config set-context --current --namespace noobaa

    # use internal postgres
    noobaa install --noobaa-image=dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-core:master-20250623 --operator-image=dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-operator:5.18.4 --db-image=quay.io/sclorg/postgresql-15-c9s:latest --namespace=noobaa

    # use external postgres
    noobaa install --noobaa-image=dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-core:master-20250623 --operator-image=dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-operator:5.18.4 --postgres-url="postgres://postgres:noobaa123@10.220.32.18:5432/nbcore" --namespace=noobaa

    # use dingofs image
    noobaa install --noobaa-image harbor.zetyun.cn/dingofs/noobaa-core:dingofs-v1.0 --operator-image dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-operator:5.18.4 --namespace=noobaa --debug-level all
  • status

    noobaa status --show-secrets
  • uninstall

    noobaa uninstall --cleanup

    # clean up leftover data
    kubectl exec -it <noobaaFS-debug-pod> -n dingofs -- bash
    cd /dfs/noobaa-debug-pv-xxx
    rm -rf noobaa-db-noobaa-db-pg-0
    rm -rf noobaa-noobaa-default-backing-store-noobaa-pvc-xxx
  • upgrade

    noobaa upgrade --noobaa-image <noobaa-image-path-and-tag> --operator-image <operator-image-path-and-tag>

    # image update
    sudo ctr -n k8s.io images pull harbor.zetyun.cn/dingofs/noobaa-core:dingofs-v1.0.3

    # e.g.
    noobaa upgrade --noobaa-image harbor.zetyun.cn/dingofs/noobaa-core:dingofs-v1.0.3 --operator-image dockerproxy.zetyun.cn/docker.io/noobaa/noobaa-operator:5.18.4 --debug-level all

pod

noobaa-core-0                                      2/2     Running   0          2m5s
noobaa-db-pg-0                                     1/1     Running   0          2m6s
noobaa-default-backing-store-noobaa-pod-cb1747eb   1/1     Running   0          19s
noobaa-endpoint-79677c7dd9-cjqdj                   1/1     Running   0          42s
noobaa-operator-7d969db69c-gmlm4                   1/1     Running   0          2m28s

default resources

  • noobaa-core-0

    Limits:
      cpu:     999m
      memory:  4Gi
    Requests:
      cpu:     999m
      memory:  4Gi
  • noobaa-db-pg-0

    Limits:
      cpu:     500m
      memory:  4Gi
    Requests:
      cpu:     500m
      memory:  4Gi
  • noobaa-default-backing-store-……

    Limits:
      cpu:     100m
      memory:  400Mi
    Requests:
      cpu:     100m
      memory:  400Mi
  • noobaa-endpoint-……

    Limits:
      cpu:     999m
      memory:  2Gi
    Requests:
      cpu:     999m
      memory:  2Gi
  • noobaa-operator-…

    Limits:
      cpu:     250m
      memory:  512Mi
    Requests:
      cpu:     250m
      memory:  512Mi

endpoint

Endpoints are deployed as a Deployment with autoscaling, so minCount/maxCount set the range the autoscaler works within; this is typically how you increase the system's S3 throughput. Prefer adding endpoints over increasing the resources of each endpoint.

replica

kubectl patch noobaa noobaa --type merge --patch '{
  "spec": {
    "endpoints": {
      "minCount": 3,
      "maxCount": 3,
      "resources": {
        "limits": {
          "cpu": "4",
          "memory": "4Gi"
        },
        "requests": {
          "cpu": "4",
          "memory": "4Gi"
        }
      }
    }
  }
}'

bootstrap

# /noobaa_init_files/noobaa_init.sh init_endpoint
init_endpoint() {
  fix_non_root_user

  # The nsfs folder is the root folder of mount points to backing storages.
  # In order to avoid access-denied errors in sub folders, configure nsfs with full permissions (777)
  if [ -d "/nsfs" ]; then
    chmod 777 /nsfs
  fi

  cd /root/node_modules/noobaa-core/
  run_internal_process node --unhandled-rejections=warn ./src/s3/s3rver_starter.js
}

run_internal_process() {
  while true
  do
    local package_path="/root/node_modules/noobaa-core/package.json"
    local version=$(cat ${package_path} | grep version | awk '{print $2}' | sed 's/[",]//g')
    echo "Version is: ${version}"
    echo "Running: $*"
    $*
    rc=$?
    echo -e "\n\n\n"
    echo "######################################################################"
    echo "$(date) NooBaa: Process exited RIP (RC=$rc)"
    echo "######################################################################"
    echo -e "\n\n\n"

    mode="manual" # initial value just to start the loop
    while [ "$mode" == "manual" ]
    do
      # load mode from file/env
      if [ -f "./NOOBAA_INIT_MODE" ]
      then
        mode="$(cat ./NOOBAA_INIT_MODE)"
      else
        mode="$NOOBAA_INIT_MODE"
      fi

      if [ "$mode" == "auto" ]
      then
        echo "######################################################################"
        echo "$(date) NooBaa: Restarting process (NOOBAA_INIT_MODE=auto)"
        echo "######################################################################"
        echo -e "\n\n\n"
        # will break from the inner loop and re-run the process
      elif [ "$mode" == "manual" ]
      then
        echo "######################################################################"
        echo "$(date) NooBaa: Waiting for manual intervention (NOOBAA_INIT_MODE=manual)"
        echo "######################################################################"
        echo -e "\n"
        sleep 10
        # will re-enter the inner loop and reload the mode
      else
        [ ! -z "$mode" ] && echo "NooBaa: unrecognized NOOBAA_INIT_MODE = $mode"
        return $rc
      fi
    done
  done
}
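The practical consequence: to flip a crash-looping endpoint into automatic restarts, write `auto` into ./NOOBAA_INIT_MODE inside the container. The lookup order (file wins over the environment variable) can be sketched as a standalone toy, not the actual script:

```shell
# Minimal sketch of the mode lookup used above: the file ./NOOBAA_INIT_MODE
# takes precedence over the NOOBAA_INIT_MODE environment variable.
get_mode() {
  if [ -f "./NOOBAA_INIT_MODE" ]; then
    cat ./NOOBAA_INIT_MODE
  else
    echo "$NOOBAA_INIT_MODE"
  fi
}

NOOBAA_INIT_MODE=auto
rm -f ./NOOBAA_INIT_MODE
echo "from env: $(get_mode)"    # -> from env: auto

echo manual > ./NOOBAA_INIT_MODE
echo "from file: $(get_mode)"   # -> from file: manual
rm -f ./NOOBAA_INIT_MODE
```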

noobaa-db-pg-0

# volume
/var/lib/pgsql from db (rw)

db:
  Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
  ClaimName:  db-noobaa-db-pg-0
  ReadOnly:   false

service

NAME            TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                    AGE
noobaa-db-pg    ClusterIP      None            <none>        5432/TCP                                                   2m15s
noobaa-mgmt     ClusterIP      10.97.139.113   <none>        80/TCP,443/TCP,8445/TCP,8446/TCP                           2m14s
noobaa-syslog   ClusterIP      10.102.80.226   <none>        514/UDP                                                    2m11s
s3              LoadBalancer   10.110.97.124   <pending>     80:32292/TCP,443:32034/TCP,8444:31155/TCP,7004:30222/TCP   2m14s
sts             LoadBalancer   10.103.219.22   <pending>     443:30126/TCP                                              2m13s

volume

noobaa-db-noobaa-db-pg-0
noobaa-noobaa-default-backing-store-noobaa-pvc-07c805ab

# mountpod
dingofs-ubuntu2-pvc-ef31f21c-5677-4f83-94ad-face89b741c3-lhenuc
dingofs-ubuntu3-pvc-10fd093a-0d75-45ee-b52d-84f712378e09-jbxeue

system info

NOOBAA_SECRET=$(kubectl get noobaa noobaa -n noobaa -o json | jq -r '.status.accounts.admin.secretRef.name' )
noobaa-admin

NOOBAA_MGMT=$(kubectl get noobaa noobaa -n noobaa -o json | jq -r '.status.services.serviceMgmt.nodePorts[0]' )
https://10.xx.xx.18:0

NOOBAA_S3=$(kubectl get noobaa noobaa -n noobaa -o json | jq -r '.status.services.serviceS3.nodePorts[0]' )
https://10.xx.xx.18:30478 # entrypoint

NOOBAA_ACCESS_KEY=$(kubectl get secret noobaa-admin -n noobaa -o json | jq -r '.data.AWS_ACCESS_KEY_ID|@base64d')
3gR59xxxxxHBbOdOx

NOOBAA_SECRET_KEY=$(kubectl get secret noobaa-admin -n noobaa -o json | jq -r '.data.AWS_SECRET_ACCESS_KEY|@base64d')
zu229xxxxxxxxxxxxxxxxxxxxxxxxcfX

Mgmt UI

kubectl get secret noobaa-admin -n noobaa -o json | jq '.data|map_values(@base64d)'
# print
{
  "AWS_ACCESS_KEY_ID": "3gR59eoLoBvhjHBbOdOx",
  "AWS_SECRET_ACCESS_KEY": "zu229aAJIo6g9GOsjKTyUc6p3Rc0c/nHGdjI3cfX",
  "email": "admin@noobaa.io",
  "password": "7JfrgTXgnJjU4ywSFWYr+Q==",
  "system": "noobaa"
}

open $NOOBAA_MGMT

aws-cli

alias s3='AWS_ACCESS_KEY_ID=3gR5xxxxxxxxOx AWS_SECRET_ACCESS_KEY=zu229aAJxxxxxxxxxxxxxxxxxcfX aws --endpoint https://10.xxx.xx.18:30478 --no-verify-ssl s3'
s3 ls
s3 sync /var/log/ s3://first.bucket
s3 ls s3://first.bucket

NSFS

1. create NSFS resource

noobaa namespacestore create nsfs dingofs --pvc-name='nsfs-dingofs-pvc'

nsfs-dingofs-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nsfs-dingofs-pvc
  namespace: noobaa
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  volumeMode: Filesystem
  storageClassName: dingofs-sc-s3
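Assuming the manifest above is saved as nsfs-dingofs-pvc.yaml, apply it before creating the namespacestore:

```shell
kubectl apply -f nsfs-dingofs-pvc.yaml
kubectl get pvc nsfs-dingofs-pvc -n noobaa
```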

delete

noobaa namespacestore delete dingofs

list

noobaa namespacestore list

2. create bucket (optional)

This step is mainly for exposing a directory that already exists in the filesystem as a bucket. If the directory was already created with s3-user-dingofs mb s3://dingofs-bucket-1, this step is unnecessary (it would report that the bucket already exists). Afterwards, operate on the bucket directly with s3-user-dingofs.

# map the filesystem directory dingofs-bucket-1 to a bucket
noobaa api bucket_api create_bucket '{
  "name": "dingofs-bucket-1",
  "namespace": {
    "write_resource": { "resource": "dingofs", "path": "dingofs-bucket-1/" },
    "read_resources": [ { "resource": "dingofs", "path": "dingofs-bucket-1/" } ]
  }
}'

status

noobaa bucket status <bucketName>

list bucket

noobaa api bucket_api list_buckets '{}'
or
noobaa bucket list

get_bucket_policy

noobaa api bucket_api get_bucket_policy '{"name": "<bucketName>"}'

delete bucket

noobaa api bucket_api delete_bucket '{"name": "<bucketName>"}'
or
noobaa bucket delete <bucketName>

add bucket policy

# bucket access policies can only be set with the admin account
AWS_ACCESS_KEY_ID=3gR59eoxxxxBbOdOx AWS_SECRET_ACCESS_KEY=zu229aAJIxxxxxxxxxxxdjI3cfX aws --endpoint-url=https://10.220.32.18:30478 --no-verify-ssl s3api put-bucket-policy --bucket dingofs-bucket-1 --policy file://policy.json

policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "id-1",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::*"]
    }
  ]
}

Sid (Statement ID) is an optional identifier for individual policy statements, allowing easier management and referencing of specific rules within a larger policy. It gives a unique name to each permission set within the bucket policy, which helps when dealing with multiple statements or when debugging policy issues.
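For instance, a policy with multiple statements might use Sids to tell the rules apart (a hypothetical sketch; the bucket name reuses the example above, and the actions chosen are illustrative):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadAll",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::dingofs-bucket-1", "arn:aws:s3:::dingofs-bucket-1/*"]
    },
    {
      "Sid": "DenyDelete",
      "Effect": "Deny",
      "Principal": "*",
      "Action": ["s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::dingofs-bucket-1/*"]
    }
  ]
}
```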

3. create account

noobaa api account_api create_account '{
  "email": "dingofs@zetyun.com",
  "name": "dingofs",
  "has_login": false,
  "s3_access": true,
  "default_resource": "dingofs",
  "nsfs_account_config": {
    "uid": 0,
    "gid": 0,
    "new_buckets_path": "/",
    "nsfs_only": false
  }
}'

# print
access_keys:
  access_key: UanhXoxxxxxIggJP
  secret_key: ptsSJUYxxxxxxxx2SF8ltV
token: eyJhbGcixxxxxxxxxxxxxxxxxxxxxxxxxxxxxqGdB-E2zQF-3MBII
  • check account

    noobaa api account_api list_accounts '{}'
    # check a specific account
    noobaa api account_api read_account '{"email":"jenia@noobaa.io"}'

4. config s3 client

alias s3-user-dingofs='AWS_ACCESS_KEY_ID=UanhxxxxxxggJP AWS_SECRET_ACCESS_KEY=ptsSJUYVCxxxxxxxxxxxSF8ltV aws --endpoint https://10.220.32.18:30478  --no-verify-ssl s3'

5. operate

aws s3

# create bucket
s3-user-dingofs mb s3://<bucketName>

# copy object
s3-user-dingofs cp <file> s3://<bucketName>/

# list object
s3-user-dingofs ls s3://<bucketName>/

mc

mc alias set <aliasName> <entrypoint> <ak> <sk> --insecure
mc ls <aliasName> --insecure

Best Practices

External Postgresql DB (TBD)

  • binary install

    sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
    sudo dnf -qy module disable postgresql
    sudo dnf install -y postgresql17-server
    sudo /usr/pgsql-17/bin/postgresql-17-setup initdb
    sudo systemctl enable postgresql-17
    sudo systemctl start postgresql-17
  • docker install

    # PostgreSQL 17
    docker run --name noobaa-postgres \
      -e POSTGRES_PASSWORD=<> \
      --restart always \
      --network host \
      -d dockerproxy.zetyun.cn/docker.io/postgres:latest

    # PostgreSQL 15
    docker run --name noobaa-postgres \
      -e POSTGRESQL_ADMIN_PASSWORD=<> \
      --restart always \
      --network host \
      -d quay.io/sclorg/postgresql-15-c9s:latest

    # enter psql
    psql -U postgres

    # create the nbcore database
    CREATE DATABASE nbcore WITH LC_COLLATE = 'C' TEMPLATE template0;

    # check
    \list

    # db url
    postgres://postgres:<mysecretpassword>@<ip>:5432/nbcore

    When using the postgres 15 image, configure the following variables:

    POSTGRESQL_USER, POSTGRESQL_PASSWORD, POSTGRESQL_DATABASE

    or the environment variable POSTGRESQL_ADMIN_PASSWORD, or both.
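The connection string passed to `noobaa install --postgres-url` can be assembled from these pieces (the values below mirror the external-postgres example in the install section):

```shell
# Compose the --postgres-url value (example values from the install section above)
PGUSER=postgres
PGPASS=noobaa123
PGHOST=10.220.32.18
PGDB=nbcore
DB_URL="postgres://${PGUSER}:${PGPASS}@${PGHOST}:5432/${PGDB}"
echo "$DB_URL"
# -> postgres://postgres:noobaa123@10.220.32.18:5432/nbcore
```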

Troubleshooting

load balancing of the entrypoint

TODO

Verify specifying the default SC at install time with --db-storage-class and --pv-pool-default-storage-class.

Try a default SC without storageclass.kubernetes.io/is-default-class: "true" set.