Robusta KRR - A Tool for Optimizing Kubernetes Resource Allocation

2023-05-31



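Robusta KRR (Kubernetes Resource Recommender) is a command-line tool for optimizing resource allocation in Kubernetes clusters. It collects pod usage data from Prometheus and recommends requests and limits values for CPU and memory, which can significantly reduce costs and improve performance.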

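Features

No agent required: Robusta KRR is a CLI tool that runs on your local machine; it does not need to run any Pods in your cluster.
Prometheus integration: collects resource usage data using built-in Prometheus queries; support for custom queries is coming soon.
Extensible strategies: easily create and use your own strategies for calculating resource recommendations.
Future support: upcoming releases will support custom resources (such as GPU) and custom metrics.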

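According to a recent Sysdig study (https://sysdig.com/blog/millions-wasted-kubernetes/), Kubernetes clusters have, on average:

69% unused CPU
18% unused memory

By right-sizing your containers with KRR, you can save an average of 69% on cloud costs.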

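If you use Robusta SaaS, KRR is integrated starting from v0.10.15. You can view all recommendations (including previous ones) and filter and sort them by cluster, namespace, or name.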

How It Works

Metrics Collection

Robusta KRR uses the following Prometheus queries to collect usage data:

CPU usage: sum(irate(container_cpu_usage_seconds_total{{namespace="{object.namespace}", pod="{pod}", container="{object.container}"}}[{step}]))
Memory usage: sum(container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", image!="", namespace="{object.namespace}", pod="{pod}", container="{object.container}"})
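The placeholders in the CPU query look like Python format fields, with the doubled braces acting as escapes for literal PromQL braces. The following is a minimal sketch of how such a template could be filled in, assuming `str.format` semantics; the `WorkloadContainer` class, the `render_cpu_query` helper, and the sample values are hypothetical and are not taken from KRR's source.

```python
from dataclasses import dataclass


@dataclass
class WorkloadContainer:
    """Hypothetical stand-in for the object KRR scans (not KRR's real type)."""
    namespace: str
    container: str


# The CPU query from the article, written as a str.format template.
# "{{" and "}}" are escapes that render as literal "{" and "}".
CPU_QUERY_TEMPLATE = (
    "sum(irate(container_cpu_usage_seconds_total"
    '{{namespace="{object.namespace}", pod="{pod}", container="{object.container}"}}'
    "[{step}]))"
)


def render_cpu_query(obj: WorkloadContainer, pod: str, step: str) -> str:
    """Fill the template for one pod and one container."""
    return CPU_QUERY_TEMPLATE.format(object=obj, pod=pod, step=step)


query = render_cpu_query(WorkloadContainer("default", "nginx"), "nginx-abc", "5m")
print(query)
```

The rendered string is ordinary PromQL that can be pasted into the Prometheus UI to inspect the raw series for a single container.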

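Algorithm

By default, KRR uses a simple strategy to calculate resource recommendations. It works as follows (the exact numbers can be customized via CLI arguments):

For CPU, the request is set to the 99th percentile, with no limit. This means the CPU request is sufficient 99% of the time. For the remaining 1%, no limit is set, so your Pod can burst and use any CPU available on the node, for example CPU that other Pods requested but are not currently using.
For memory, take the maximum value over the past week and add a 5% buffer.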
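The two rules above can be illustrated with a short calculation. This is a sketch only: it assumes a nearest-rank percentile (KRR's exact percentile method may differ), and the function names and sample data are made up for illustration.

```python
import math


def nearest_rank_percentile(samples, percentile):
    """Return the nearest-rank percentile of samples (an assumed method)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def simple_recommendation(cpu_samples, memory_samples,
                          cpu_percentile=99, memory_buffer_percent=5):
    """Apply the two rules: CPU percentile with no limit, memory max plus buffer."""
    cpu_request = nearest_rank_percentile(cpu_samples, cpu_percentile)
    memory = max(memory_samples) * (100 + memory_buffer_percent) / 100
    return {"cpu_request": cpu_request, "cpu_limit": None, "memory": memory}


# Hypothetical usage history: CPU in millicores, memory in bytes.
cpu_history = list(range(1, 101))
memory_history = [150 * 1024**2, 200 * 1024**2, 180 * 1024**2]

result = simple_recommendation(cpu_history, memory_history)
print(result)
```

With this sample data the CPU request lands on the 99th of 100 sorted samples, and the memory value is the 200 MiB peak plus 5%.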

Installation and Usage

macOS/Linux users can install with brew in one step:

brew tap robusta-dev/homebrew-krr
brew install krr

Once installed, run the following command to check that the installation succeeded:

krr --help # may take a while the first time

To install manually, first make sure Python 3.9 or later is installed on your machine. Then clone the code:

git clone https://github.com/robusta-dev/krr
cd krr

Install the dependencies:

pip install -r requirements.txt

Finally, run the tool with the following command:

python krr.py --help

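Note that when running from source you must invoke the tool as a Python script, whereas the brew installation lets you run krr directly. All examples in this article show the command as krr ...; if you installed manually, replace it with python krr.py ....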

With the installation complete, you can start using KRR. For example, run the simple strategy:

krr simple

If you only want specific namespaces (default and ingress-nginx):

krr simple -n default -n ingress-nginx

By default, krr runs in the current context. To run it in different contexts:

krr simple -c my-cluster-1 -c my-cluster-2

To get output in JSON format (--logtostderr is required so that logs do not end up in the result file):

krr simple --logtostderr -f json > result.json

To get output in YAML format:

krr simple --logtostderr -f yaml > result.yaml

To see additional debug logs:

krr simple -v

More information about strategy settings is available via:

krr simple --help
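When KRR is driven from a script, the flags shown above can be assembled programmatically. The sketch below only builds the argument list and does not execute anything; `build_krr_command` is a hypothetical helper, and it uses only the flags that appear in the commands above (-n, -c, -f, --logtostderr, -v).

```python
import shlex


def build_krr_command(strategy="simple", namespaces=(), contexts=(),
                      output_format=None, verbose=False):
    """Assemble a krr invocation as an argument list (hypothetical helper)."""
    args = ["krr", strategy]
    for namespace in namespaces:
        args += ["-n", namespace]
    for context in contexts:
        args += ["-c", context]
    if output_format:
        # --logtostderr keeps log lines out of the redirected result file.
        args += ["--logtostderr", "-f", output_format]
    if verbose:
        args.append("-v")
    return args


command = build_krr_command(namespaces=["default", "ingress-nginx"],
                            output_format="json")
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run` with stdout redirected to a file, which mirrors the shell redirection used in the JSON and YAML examples.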

By default, KRR tries to auto-discover a running Prometheus by scanning for the following labels:

"app=kube-prometheus-stack-prometheus"
"app=prometheus,component=server"
"app=prometheus-server"
"app=prometheus-operator-prometheus"
"app=prometheus-msteams"
"app=rancher-monitoring-prometheus"
"app=prometheus-prometheus"
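To make the discovery idea concrete, here is a minimal sketch of label-selector matching over a set of services. It is not KRR's actual implementation: the `parse_selector` and `find_prometheus` helpers and the sample services are hypothetical, and the sketch assumes selectors are tried in the listed order with the first match winning.

```python
# The label selectors listed in the article, in order.
PROMETHEUS_SELECTORS = [
    "app=kube-prometheus-stack-prometheus",
    "app=prometheus,component=server",
    "app=prometheus-server",
    "app=prometheus-operator-prometheus",
    "app=prometheus-msteams",
    "app=rancher-monitoring-prometheus",
    "app=prometheus-prometheus",
]


def parse_selector(selector):
    """Turn 'k1=v1,k2=v2' into {'k1': 'v1', 'k2': 'v2'}."""
    return dict(pair.split("=", 1) for pair in selector.split(","))


def find_prometheus(services, selectors=PROMETHEUS_SELECTORS):
    """Return the first service name whose labels satisfy any selector, else None."""
    for selector in selectors:
        wanted = parse_selector(selector)
        for name, labels in services.items():
            if all(labels.get(key) == value for key, value in wanted.items()):
                return name
    return None


# Hypothetical services and their labels.
services = {
    "grafana": {"app": "grafana"},
    "prom-server": {"app": "prometheus", "component": "server"},
}
found = find_prometheus(services)
print(found)
```

A selector with several comma-separated pairs matches only when every pair is present, which is why a service labeled just app=prometheus would not match the second entry.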

If Prometheus is not found via any of these labels, you will get an error message and must pass the URL explicitly (using the -p flag).

If KRR cannot connect to your Prometheus automatically, you can forward Prometheus manually with kubectl port-forward.

For example, if you have a Prometheus Pod named kube-prometheus-st-prometheus-0, run the following command to port-forward it:

kubectl port-forward pod/kube-prometheus-st-prometheus-0 9090

Then open another terminal and run krr in it, passing an explicit Prometheus URL:

krr simple -p http://127.0.0.1:9090
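Before pointing krr at a forwarded port, it can help to confirm that Prometheus answers on that address. The sketch below only constructs an instant-query URL for the standard Prometheus HTTP API endpoint /api/v1/query; the `build_instant_query_url` helper and the `up` test query are illustrative choices, not part of KRR.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url, promql):
    """Build a Prometheus instant-query URL (hypothetical helper)."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


url = build_instant_query_url("http://127.0.0.1:9090", "up")
print(url)
```

While the port-forward is running, fetching this URL (for example with `urllib.request.urlopen`) should return a JSON document; a connection error suggests the forward is not active.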

You can also create custom strategies to suit your own needs. The following code creates one:

# This is an example of how to create your own custom strategy
import pydantic as pd

import robusta_krr
from robusta_krr.api.models import HistoryData, K8sObjectData, ResourceRecommendation, ResourceType, RunResult
from robusta_krr.api.strategies import BaseStrategy, StrategySettings


# Providing a description for each setting makes it available in the CLI help
class CustomStrategySettings(StrategySettings):
    param_1: float = pd.Field(99, gt=0, description="First example parameter")
    param_2: float = pd.Field(105_000, gt=0, description="Second example parameter")


class CustomStrategy(BaseStrategy[CustomStrategySettings]):
    """
    A custom strategy that uses the provided parameters for CPU and memory.
    Made only in order to demonstrate how to create a custom strategy.
    """

    def run(self, history_data: HistoryData, object_data: K8sObjectData) -> RunResult:
        return {
            ResourceType.CPU: ResourceRecommendation(request=self.settings.param_1, limit=None),
            ResourceType.Memory: ResourceRecommendation(request=self.settings.param_2, limit=self.settings.param_2),
        }


# Running this file will register the strategy and make it available to the CLI
# Run it as `python ./custom_strategy.py my_strategy`
if __name__ == "__main__":
    robusta_krr.run()