Upgrade flashinfer-python version in TensorRT-LLM. Fetches the latest releases from GitHub (stable and nightly), compares with the current pinned version, lets the user pick a target version, and updates all version references across the repo. Use when the user wants to bump or upgrade flashinfer.
npx skill4agent add nvidia/skills trtllm-flashinfer-upgrade

Detect the GitHub username with `gh`:

GITHUB_USERNAME=$(gh api user --jq .login)
echo "$GITHUB_USERNAME"

If `gh` is unavailable, derive it from the git remotes:

GITHUB_USERNAME=$(git remote -v | grep -E 'github\.com[:/][^/]+/TensorRT-LLM' \
  | head -1 | sed -E 's|.*github\.com[:/]([^/]+)/TensorRT-LLM.*|\1|')

If neither works, ask the user with `AskUserQuestion`. Then check for a fork remote (double quotes so that the variable expands):

git remote -v | grep -E "github\.com[:/]${GITHUB_USERNAME}/TensorRT-LLM"

If nothing matches, report: No GitHub fork remote detected. A fork of `NVIDIA/TensorRT-LLM` is required to push branches and create PRs.
- Fork the repo at https://github.com/NVIDIA/TensorRT-LLM/fork
- Add it as a git remote:
  git remote add fork https://github.com/<GITHUB_USERNAME>/TensorRT-LLM.git
- Re-run this skill.
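The remote-parsing fallback can also be expressed in Python, which makes it easier to test against both HTTPS and SSH remote URLs. This is a minimal sketch, not part of the skill: `fork_owner` is a hypothetical helper, and skipping the upstream `NVIDIA` owner is an added assumption (the shell pipeline simply takes the first match).

```python
import re
from typing import Optional

# Matches HTTPS (github.com/owner/...) and SSH (github.com:owner/...) remotes.
_REMOTE_RE = re.compile(r"github\.com[:/]([^/\s]+)/TensorRT-LLM(?:\.git)?(?:\s|$)")


def fork_owner(remote_output: str, upstream: str = "NVIDIA") -> Optional[str]:
    """Return the first non-upstream owner found in `git remote -v` output."""
    for line in remote_output.splitlines():
        match = _REMOTE_RE.search(line)
        # Assumption: the upstream remote is never the user's fork.
        if match and match.group(1).lower() != upstream.lower():
            return match.group(1)
    return None


# Sample `git remote -v` output, made up for illustration.
sample = (
    "origin\thttps://github.com/NVIDIA/TensorRT-LLM.git (fetch)\n"
    "fork\tgit@github.com:yihwang-nv/TensorRT-LLM.git (fetch)\n"
)
print(fork_owner(sample))  # yihwang-nv
```

A `None` result corresponds to the "No GitHub fork remote detected" case.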
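Check `gh` authentication: `gh auth status` should report `Logged in to github.com` with the `repo` scope, which `gh` needs to open PRs against `NVIDIA/TensorRT-LLM`. If `gh` is not logged in:

gh auth login

Choose: GitHub.com → HTTPS → authenticate with a web browser (or paste a PAT with `repo` scope).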
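`GH_CONFIG_DIR`: if `gh` needs a non-default config directory to reach `NVIDIA/TensorRT-LLM`, check `CLAUDE.local.md` or `AGENTS.md` for a `GH_CONFIG_DIR` setting and prefix each `gh` command with it:

GH_CONFIG_DIR=<path> gh ...

If `gh auth status` still fails, ask the user with `AskUserQuestion`.

Fetch the release list with `WebFetch` from https://github.com/flashinfer-ai/flashinfer/releases. Stable tags look like `v0.6.7`; nightly tags look like `v0.7.0.dev20260401`. If `WebFetch` cannot read the page, query the GitHub API instead:

curl -s "https://api.github.com/repos/flashinfer-ai/flashinfer/releases?per_page=30" \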
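  | python3 -c "
import json, sys
releases = json.load(sys.stdin)
# One line per release: tag, publish date, stable or pre-release
for r in releases:
    tag = r['tag_name']
    pre = ' (pre-release)' if r['prerelease'] else ' (stable)'
    date = r['published_at'][:10]
    print(f'{tag} {date}{pre}')
"

Read the current pin from `requirements.txt`:

grep flashinfer-python requirements.txt

The pin has the form `flashinfer-python==X.Y.Z`.

Use `AskUserQuestion` to confirm the target version and whether `security_scanning/poetry.lock` should be updated as well. Updating `security_scanning/poetry.lock` also means refreshing its `metadata.content-hash`.

| File | What to change | Always |
|---|---|---|
| `requirements.txt` | `flashinfer-python==X.Y.Z` pin | Yes |
| `security_scanning/pyproject.toml` | flashinfer-python version pin | Yes |
| `ATTRIBUTIONS-Python.md` | flashinfer-python version | Yes |
| `security_scanning/poetry.lock` | Update the `flashinfer-python` entry | Only if user opted in at Step 3 question 3 |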
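Rewriting the `flashinfer-python==X.Y.Z` pin in `requirements.txt` can be done mechanically. The sketch below assumes the pin starts a line and uses `==`; `bump_pin` is a hypothetical helper and the sample versions are illustrative data, not taken from the repository.

```python
import re

# Matches a pinned requirement such as "flashinfer-python==0.6.7" at line start.
PIN_RE = re.compile(r"^(flashinfer-python==)(\S+)", re.MULTILINE)


def bump_pin(requirements_text: str, new_version: str):
    """Return (old_version, updated_text); raise if no == pin is present."""
    match = PIN_RE.search(requirements_text)
    if match is None:
        raise ValueError("no flashinfer-python== pin found")
    old_version = match.group(2)
    # A callable replacement avoids backreference clashes with version digits.
    updated = PIN_RE.sub(lambda m: m.group(1) + new_version,
                         requirements_text, count=1)
    return old_version, updated


# Illustrative input only.
sample = "torch>=2.0\nflashinfer-python==0.6.6\nnumpy\n"
old, new_text = bump_pin(sample, "0.6.7")
print(old)  # 0.6.6
print(new_text)
```

Returning the old version alongside the new text keeps the `OLD`/`NEW` values for the commit title and PR title in one place. A git-URL pin (the nightly case) would not match this pattern and raises instead of being silently skipped.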
`security_scanning/poetry.lock`

Only perform this subsection if the user answered Yes to question 3 in Step 3. Otherwise skip it entirely.
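curl -s "https://pypi.org/pypi/flashinfer-python/NEW_VERSION/json" \
  | python3 -c "
import json, sys
data = json.load(sys.stdin)
# Print each distribution file with its sha256 digest
for f in data['urls']:
    print(f'{f[\"filename\"]} sha256:{f[\"digests\"][\"sha256\"]}')
"

Replace the `files = [...]` list under `[[package]] name = "flashinfer-python"` with the new filenames and hashes, and update `[package.dependencies]` if `requires_dist` changed. After editing `security_scanning/pyproject.toml` and `security_scanning/poetry.lock`, refresh `metadata.content-hash`:

cd security_scanning && poetry lock --no-update && cd ..

If `poetry` is available, `poetry add flashinfer-python@NEW_VERSION` is an alternative that updates both `security_scanning/pyproject.toml` and `poetry.lock`.

A nightly version such as `0.7.0.dev20260401` may not be published on PyPI. Check with:

curl -s "https://pypi.org/pypi/flashinfer-python/VERSION/json"

If it is not published, leave the hashes in `security_scanning/poetry.lock` as they are and add the comment `# TODO: update hashes when published to PyPI`. In `requirements.txt`, pin to the git tag instead:

flashinfer-python @ git+https://github.com/flashinfer-ai/flashinfer.git@TAG#egg=flashinfer-python

Search the Python sources for version checks:

grep -rn 'flashinfer.*__version__\|flashinfer.*version' \
  tensorrt_llm/ --include="*.py"

For example, `tensorrt_llm/_torch/speculative/interface.py` contains `flashinfer.__version__ >= "0.6.4"`.

Install and test:

pip install -r requirements.txt
pytest tests/unittest/_torch/flashinfer/ -v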
pytest tests/unittest/_torch/attention/test_flashinfer_attention.py -v

If the user opted out of the `poetry.lock` update at Step 3 question 3, drop `security_scanning/poetry.lock` from the `git stash`, `git add`, and commit message in the snippets below.
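One way to keep the opt-out rule consistent across `git stash`, `git add`, and the commit title is to derive all three from the same inputs. This is a sketch under that assumption; the helper names (`files_to_stage`, `branch_name`, `commit_title`) are hypothetical, while the file list, branch pattern, and title format are taken from the snippets that follow.

```python
# Files touched on every upgrade, in the order used by the git commands.
ALWAYS_STAGED = [
    "requirements.txt",
    "security_scanning/pyproject.toml",
    "ATTRIBUTIONS-Python.md",
]
LOCK_FILE = "security_scanning/poetry.lock"


def files_to_stage(update_lock: bool) -> list:
    """Return the paths to pass to `git stash push --` and `git add`."""
    files = list(ALWAYS_STAGED)
    if update_lock:
        # Keep the lock file next to pyproject.toml, as in the git snippets.
        files.insert(2, LOCK_FILE)
    return files


def branch_name(username: str, new_version: str) -> str:
    """Mirror ${GITHUB_USERNAME}/update_flashinfer_${NEW_VERSION}."""
    return f"{username}/update_flashinfer_{new_version}"


def commit_title(old_version: str, new_version: str) -> str:
    """Mirror the commit and PR title format."""
    return f"[None][chore] Update flashinfer-python from {old_version} to {new_version}"


print(" ".join(files_to_stage(update_lock=False)))
print(branch_name("yihwang-nv", "0.6.7.post3"))
```

With `update_lock=False` the lock file never reaches the command line, so the stash and the commit cannot disagree about which files are included.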
# Drop security_scanning/poetry.lock from this list if the user opted out.
git stash push -m "flashinfer-upgrade-wip" -- requirements.txt security_scanning/pyproject.toml security_scanning/poetry.lock ATTRIBUTIONS-Python.md
git checkout main
git pull --rebase https://github.com/NVIDIA/TensorRT-LLM.git main
git checkout -b "${GITHUB_USERNAME}/update_flashinfer_${NEW_VERSION}"
git stash pop

Here `GITHUB_USERNAME` is the detected username (for example `yihwang-nv`) and `NEW_VERSION` is the target version (for example `0.6.7.post3`).

# Drop security_scanning/poetry.lock from the `git add` list and the commit
# body if the user opted out.
git add requirements.txt security_scanning/pyproject.toml security_scanning/poetry.lock ATTRIBUTIONS-Python.md
git commit -s -m "[None][chore] Update flashinfer-python from OLD to NEW

Bump flashinfer-python dependency to the latest stable release.
Updated version pins in requirements.txt, security_scanning/pyproject.toml,
security_scanning/poetry.lock (if updated), and ATTRIBUTIONS-Python.md."

Push the branch to the `fork` remote:

FORK_REMOTE=fork # adjust if the user named their fork remote differently
BRANCH="${GITHUB_USERNAME}/update_flashinfer_${NEW_VERSION}"
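git push -u "${FORK_REMOTE}" "${BRANCH}"

Before creating the PR, confirm that `gh auth status` shows the `repo` scope, and prefix `gh` with `GH_CONFIG_DIR` if that is what gives access to `NVIDIA/TensorRT-LLM`.

gh pr create \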
--repo NVIDIA/TensorRT-LLM \
--base main \
--head "${GITHUB_USERNAME}:${BRANCH}" \
--title "[None][chore] Update flashinfer-python from ${OLD_VERSION} to ${NEW_VERSION}" \
--body "$(cat <<EOF
## Summary
- Bump flashinfer-python from ${OLD_VERSION} to ${NEW_VERSION} (latest stable)
- Updated version pins in requirements.txt, security_scanning/pyproject.toml, and ATTRIBUTIONS-Python.md (and security_scanning/poetry.lock if the user opted in)
## Test plan
- [ ] pip install -r requirements.txt installs successfully
- [ ] pytest tests/unittest/_torch/flashinfer/ -v
- [ ] pytest tests/unittest/_torch/attention/test_flashinfer_attention.py -v
- [ ] CI pre-merge passes
EOF
)"

`gh pr create` prints the URL of the new PR.

| File | Pattern |
|---|---|
| |
| |
| |
| |
`setup.py`, `.pre-commit-config.yaml`, `pyproject.toml`, `flashinfer/`, `flashinfer-python`
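Since the skill is meant to update every version reference in the repo, a quick scan of candidate files can confirm that no pin was missed. The following is a minimal sketch: `find_references` and `REF_RE` are hypothetical names, the pattern only recognizes `==` pins, and the demo runs on a temporary directory rather than a real checkout.

```python
import re
import tempfile
from pathlib import Path

# Assumed pattern: the package name with an optional ==version pin.
REF_RE = re.compile(r"flashinfer-python(?:==(\S+))?")


def find_references(root, filenames):
    """Return (filename, line number, pinned version or None) per mention."""
    hits = []
    for name in filenames:
        path = Path(root) / name
        if not path.is_file():
            continue  # skip files that are absent in this checkout
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            match = REF_RE.search(line)
            if match:
                hits.append((name, lineno, match.group(1)))
    return hits


# Demo on a throwaway directory so the sketch runs without a repo checkout.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "requirements.txt").write_text(
        "numpy\nflashinfer-python==0.6.7\n", encoding="utf-8"
    )
    demo_hits = find_references(tmp, ["requirements.txt", "setup.py"])
print(demo_hits)  # [('requirements.txt', 2, '0.6.7')]
```

Running it before and after the edit, with the repository root as `root`, shows whether any listed file still carries the old version.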