feat: development

pull/139/head
Yeuoly 2024-07-07 19:08:10 +08:00
parent b62b2df2a9
commit 26e1c17516
No known key found for this signature in database
GPG Key ID: A66E7E320FB19F61
8 changed files with 148 additions and 0 deletions

@ -103,6 +103,11 @@
* [Seek Support](community/support.md)
* [Become a Contributor](community/contribution.md)
## Development <a href="#development" id="development"></a>
* [Backend](development/backend/README.md)
* [DifySandbox](development/backend/sandbox/README.md)
## Learn More
* [Use Cases](learn-more/use-cases/README.md)

@ -0,0 +1,2 @@
# Backend Development

@ -0,0 +1,18 @@
# DifySandbox
### Introduction
`DifySandbox` is a lightweight, fast, and secure code execution environment that supports multiple programming languages, including Python and Node.js. It serves as the underlying execution environment for several components of Dify Workflow: the Code node, the Template Transform node, the Jinja2 syntax in the LLM node, and the Code Interpreter in the Tool node. DifySandbox lets Dify execute user-provided code while keeping the overall system secure.
### Features
- **Multi-language Support**: DifySandbox is built on Seccomp, a Linux kernel mechanism for filtering system calls. Because the restriction is applied at the system level rather than per language, it can support multiple programming languages. Currently, it supports Python and Node.js.
- **System Security**: It enforces a whitelist policy that permits only specific system calls, so that user code cannot bypass the sandbox through an unexpected call.
- **File System Isolation**: User code runs in an isolated file system environment.
- **Network Isolation**:
  - **DockerCompose**: Utilizes a separate Sandbox network and proxy containers for network access, maintaining intranet system security while offering flexible proxy configuration options.
  - **K8s**: Network isolation strategies can be directly configured using Egress policies.
### Project Repository
You can access the [DifySandbox](https://github.com/langgenius/dify-sandbox) repository to obtain the project source code and follow the project documentation for deployment and usage instructions.
### Contribution
Please refer to the [Contribution Guide](./contribution.md) to learn how you can participate in the development of DifySandbox.

@ -0,0 +1,49 @@
# Contribution
### Code Structure
The following code file structure outlines the organization of the project:
```
[cmd/]
├── server // Server startup entry point
├── lib // Shared library entry point
└── test // Common test scripts
[build/] // Build scripts for different architectures and platforms
[internal/] // Internal packages
├── controller // HTTP request handlers
├── middleware // Request processing middleware
├── server // Server setup and configuration
├── service // Controller services
├── static // Configuration files
│ ├── nodejs_syscall // Node.js system call whitelist
│ └── python_syscall // Python system call whitelist
├── types // Entity definitions
├── core // Core isolation and execution logic
│ ├── lib // Shared libraries
│ ├── runner // Code execution
│ │ ├── nodejs // Node.js executor
│ │ └── python // Python executor
└── tests // CI/CD tests
```
### Principle
The core logic has two entry points: the `HTTP` service entry for `DifySandbox` and the `dynamic link library` entry. When the Sandbox runs code, it first generates a temporary code file. The file starts by calling the `dynamic link library` to initialize the runtime environment (the `Sandbox`), and only then runs the user's code. The Sandbox never executes the submitted code directly; it executes this temporary file, so that user-submitted code cannot damage the system.
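
The wrapping step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's implementation: the names `PRELUDE`, `build_wrapped_source`, and `run_wrapped` are hypothetical, and the prelude is a placeholder comment standing in for the call into the dynamic link library, so no Seccomp filter is actually applied here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical prelude. In DifySandbox this is where the dynamic link library
# would be called to initialize the sandbox; here it is a harmless placeholder.
PRELUDE = "# sandbox initialization would run here (placeholder)\n"


def build_wrapped_source(user_code: str, prelude: str = PRELUDE) -> str:
    """Return the temporary file's contents: prelude first, user code second."""
    return prelude + "\n" + user_code + "\n"


def run_wrapped(user_code: str, timeout: float = 5.0) -> str:
    """Write the wrapped source to a temporary file, run it, and return stdout."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "main.py"
        script.write_text(build_wrapped_source(user_code), encoding="utf-8")
        # The temporary file is executed, never the raw user code itself.
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    return result.stdout


if __name__ == "__main__":
    print(run_wrapped("print(1 + 1)"))
```

The point of the ordering is that any restriction installed by the prelude is already in force by the time the first line of user code runs.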
The dynamic link library uses `Seccomp` to restrict system calls. The `static` directory contains `nodejs_syscall` and `python_syscall` files, which provide system call whitelists for both `ARM64` and `AMD64` architectures. There are four files in total. Please do not modify these files unless absolutely necessary.
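
To show why each language needs a separate whitelist per architecture, here is a small Python sketch of a whitelist lookup. Everything in it is an assumption made for illustration: the allowed names are not the contents of the project's `nodejs_syscall` or `python_syscall` files, and the helper names are hypothetical. Only the system call numbers are the standard Linux values for each architecture, and they differ between `AMD64` and `ARM64`, which is why one list cannot serve both.

```python
# Standard Linux system call numbers for a handful of calls.
SYSCALL_NUMBERS = {
    "amd64": {"read": 0, "write": 1, "close": 3, "exit": 60, "exit_group": 231},
    "arm64": {"read": 63, "write": 64, "close": 57, "exit": 93, "exit_group": 94},
}

# Illustrative allow-lists only; NOT the project's real whitelists.
ALLOWED_NAMES = {
    "python": {"read", "write", "close", "exit", "exit_group"},
    "nodejs": {"read", "write", "close", "exit_group"},
}


def build_whitelist(language: str, arch: str) -> frozenset:
    """Resolve a language's allowed call names to numbers for one architecture."""
    numbers = SYSCALL_NUMBERS[arch]
    return frozenset(numbers[name] for name in ALLOWED_NAMES[language])


def is_allowed(language: str, arch: str, syscall_number: int) -> bool:
    """Whitelist policy: anything not explicitly listed is rejected."""
    return syscall_number in build_whitelist(language, arch)


if __name__ == "__main__":
    # Two languages times two architectures gives four whitelists.
    for language in ALLOWED_NAMES:
        for arch in SYSCALL_NUMBERS:
            print(language, arch, sorted(build_whitelist(language, arch)))
```

The same number can mean different calls on different architectures, so a number permitted on `AMD64` must not be assumed safe on `ARM64`.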
### How to Contribute
For minor issues like `Typos` and `Bugs`, feel free to submit a `Pull Request`. For major changes or `Feature`-level submissions, please open an `Issue` first to facilitate discussion.
#### To-Do List
Here are some items we're currently considering. If you're interested, you can choose one to contribute:
- [ ] Support for additional programming languages:
  - We currently support `Python` and `Node.js`. Consider adding support for new languages.
  - Remember to account for both `ARM64` and `AMD64` architectures, and provide `CI` testing to ensure security for any new language.
- [ ] Node.js dependency management:
  - We've implemented support for `Python` dependencies, which can be automatically installed during Sandbox initialization. However, due to the complexity of `node_modules`, we haven't yet found a good solution for `Node.js`. This is an area open for improvement.
- [ ] Image processing capabilities:
  - As multimodality becomes increasingly important, supporting image processing in the `Sandbox` would be valuable.
  - Consider adding support for image processing libraries like `Pillow`, and enable passing images into the `Sandbox` for processing in `Dify`.
- [ ] Enhanced `CI` testing:
  - Our current `CI` testing is limited and includes only basic test cases. More comprehensive testing would be beneficial.
- [ ] Multimodal data generation:
  - Explore using the `Sandbox` to generate multimodal data, such as combining text and images.

@ -102,6 +102,11 @@
* [Seek Support](community/support.md)
* [Become a Contributor](community/contribution.md)
## Development <a href="#development" id="development"></a>
* [Backend](development/backend/README.md)
* [DifySandbox](development/backend/sandbox/README.md)
## Learn More <a href="#learn-more" id="learn-more"></a>
* [Use Cases](learn-more/use-cases/README.md)

@ -0,0 +1,2 @@
# Backend Development

@ -0,0 +1,18 @@
# DifySandbox
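### Introduction
`DifySandbox` is a lightweight, fast, and secure code execution environment that supports multiple programming languages, including `Python` and `Nodejs`. The `Code` node, the `Template Transform` node, the Jinja2 syntax in the `LLM` node, and the `Code Interpreter` in the `Tool` node of `Dify Workflow` all run on DifySandbox. It keeps the whole system secure while allowing `Dify` to run user code.
### Features
- **Multi-language Support**: `DifySandbox` is built on `Seccomp`, a system-level solution, which makes it possible to support multiple programming languages. It currently supports `Python` and `Nodejs`.
- **System Security**: It uses a whitelist policy that allows only specific system calls to run, so that no unexpected bypass can occur.
- **File System Isolation**: User code runs in a separate, isolated file system.
- **Network Isolation**:
  - **DockerCompose**: Uses a separate Sandbox network and a proxy container for network access, keeping intranet systems secure while providing flexible proxy configuration.
  - **K8s**: Configure the network isolation policy directly with `Egress`.
### Project Repository
You can visit [DifySandbox](https://github.com/langgenius/dify-sandbox) to get the project source code, and follow the project documentation to deploy and use it.
### Contribution
See the [Contribution Guide](./contribution.md) to take part in the development of `DifySandbox`.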

@ -0,0 +1,49 @@
# Contribution
### Code Structure
The following file structure can help you understand how the code is organized.
```
[cmd/]
├── server // Entry point for starting the server.
├── lib // Entry point for shared libraries.
└── test // Common test scripts.
[build/] // build scripts for different architectures and platforms.
[internal/] // Internal packages.
├── controller // HTTP request handlers.
├── middleware // Middleware for request processing.
├── server // Server setup and configuration.
├── service // Provides service for controller.
├── static // Configuration files.
│ ├── nodejs_syscall // Whitelist for nodejs syscall.
│ └── python_syscall // Whitelist for python syscall.
├── types // Entities
├── core // Core logic for isolation and execution.
│ ├── lib // Shared libraries.
│ ├── runner // Code execution.
│ │ ├── nodejs // Nodejs runner.
│ │ └── python // Python runner.
└── tests // Tests for CI/CD.
```
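### Principle
The core logic currently has two entry points: the `HTTP` service entry for `DifySandbox` and the `dynamic link library` entry. When the Sandbox is about to run code, it first generates a temporary code file. The very beginning of this file calls the `dynamic link library` to initialize the runtime environment (the `Sandbox`), and only then does the user code run. The Sandbox never executes the submitted code directly; it executes this temporary file, so that user-submitted code cannot damage the system.
The dynamic link library uses `Seccomp` to restrict system calls. The permitted system calls are listed in the `nodejs_syscall` and `python_syscall` files under the `static` directory, each with a whitelist for the `ARM64` and `AMD64` architectures, four files in total. Please do not modify these files unless you have a specific need.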
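### How to Contribute
For issues such as a `Typo` or a `Bug`, feel free to submit a `PR` directly. For larger changes or `Feature`-level submissions, please open an `Issue` first so that we can discuss it properly.
#### To-Do List
Here are some items we are currently considering. If you are interested, you can pick one to contribute.
- [ ] Support for new programming languages:
  - We currently support `Python` and `Nodejs`. If you are interested, you can try adding support for a new language.
  - Note that every new language must cover both the `ARM64` and `AMD64` architectures and come with `CI` tests, so that we can verify the security of the new language.
- [ ] Nodejs dependencies:
  - So far we have only completed dependency support for `Python`, whose dependencies can be installed automatically when the `Sandbox` initializes. For `Nodejs`, the complexity of `node_modules` means we have not yet found a good solution. If you are interested, you can try to solve this problem.
- [ ] Image processing:
  - Multimodality is an inevitable trend, so supporting image processing in the `Sandbox` is worthwhile.
  - You can try adding image processing support, for example through libraries such as `Pillow`, and allow images to be passed from `Dify` into the `Sandbox` for processing.
- [ ] More complete `CI` testing:
  - Our `CI` tests are not yet comprehensive and contain only a few simple cases.
- [ ] Multimodal data generation:
  - Try using the `Sandbox` to generate multimodal data, such as combined text and image data.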