## Summary
- Adds top-level `permissions: contents: read` to 5 workflows that only
had job-level permissions: `pr_labeler_file`, `pr_labeler_title`,
`tag-external-contributions`, `v03_api_doc_build`,
`auto-label-by-package`
- SHA-pins all 14 third-party actions to full commit SHAs to prevent
supply chain attacks via tag hijacking
## Why
**Missing top-level permissions:** Without an explicit top-level
`permissions` block, workflows inherit the repository/org default token
permissions, which may be overly broad. Adding `contents: read` as the
default restricts the blast radius if a dependency or action step is
compromised.
**SHA pinning:** Mutable tags (`@v1`, `@master`) can be force-pushed by
the action maintainer or an attacker who compromises their account.
Pinning to a full 40-character SHA ensures the exact reviewed code
always runs. Tag comments are preserved for readability.
### Actions pinned
| Action | File(s) |
|--------|---------|
| `pypa/gh-action-pypi-publish` | `_release.yml` (2 uses) |
| `ncipollo/release-action` | `_release.yml` |
| `Ana06/get-changed-files` | `check_diffs.yml` |
| `astral-sh/setup-uv` | `check_diffs.yml`, `uv_setup/action.yml` |
| `CodSpeedHQ/action` | `check_diffs.yml` |
| `google-github-actions/auth` | `integration_tests.yml` |
| `aws-actions/configure-aws-credentials` | `integration_tests.yml` |
| `amannn/action-semantic-pull-request` | `pr_lint.yml` |
| `bcoe/conventional-release-labels` | `pr_labeler_title.yml` |
| `mikefarah/yq` | `v03_api_doc_build.yml` |
| `EndBug/add-and-commit` | `v03_api_doc_build.yml` |
| `peter-evans/create-pull-request` | `refresh_model_profiles.yml` |
## Test plan
- [x] CI passes — all workflows still resolve their actions correctly
- [x] Verify no functional change: SHA refs point to the same code as
the previous tags
---
> This PR was generated with assistance from an AI coding agent as part
of a repository posture check.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR updates an outdated GitHub Action version.
- Updated `astral-sh/setup-uv` from `v6` to `v7` in
`.github/actions/uv_setup/action.yml`
Looks like this was missed as part of
https://github.com/langchain-ai/langchain/pull/33457 so hopefully safe
to bring it into alignment.
Mostly adding a descriptive frontmatter to workflow files. Also address
some formatting and outdated artifacts
No functional changes outside of
[d5457c3](d5457c39ee),
[90708a0](90708a0d99),
and
[338c82d](338c82d21e)
Especially helpful for the text splitters tests where we're installing
pytorch (expensive and slow slow slow). Should speed up CI by 5-10 mins.
w/o caches, CI taking 20 minutes 😨
w/ caches, CI taking 3 minutes
It seems the caching action was not always correctly recreating
softlinks. At first glance, the softlinks it created seemed fine, but
they didn't always work. Possibly hitting some kind of underlying bug,
but not particularly worth debugging in depth -- we can manually create
the soft links we need.
- Revert "Temporarily disable step that seems to be transiently failing.
(#10234)"
- Refresh shell hashtable and show poetry/python location and version.
With this PR:
- All lint and test jobs use the exact same Python + Poetry installation
approach, instead of lints doing it one way and tests doing it another
way.
- The Poetry installation itself is cached, which saves ~15s per run.
- We no longer pass shell commands as workflow arguments to a workflow
that just runs them in a shell. This makes our actions more resilient to
shell code injection.
If y'all like this approach, I can modify the scheduled tests workflow
and the release workflow to use this too.
The current timeouts are too long, and mean that if the GitHub cache
decides to act up, jobs get bogged down for 15min at a time. This has
happened 2-3 times already this week -- a tiny fraction of our total
workflows but really annoying when it happens to you. We can do better.
Installing deps on cache miss takes about ~4min, so it's not worth
waiting more than 4min for the deps cache. The black and mypy caches
save 1 and 2min, respectively, so wait only up to that long to download
them.
Using `${{ }}` to construct shell commands is risky, since the `${{ }}`
interpolation runs first and ignores shell quoting rules. This means
that shell commands that look safely quoted, like `echo "${{
github.event.issue.title }}"`, are actually vulnerable to shell
injection.
More details here:
https://github.blog/2023-08-09-four-tips-to-keep-your-github-actions-workflows-secure/
Ternary operators in GitHub Actions syntax are pretty ugly and hard to
read: `inputs.working-directory == '' && '.' ||
inputs.working-directory` means "if the condition is true, use `'.'` and
otherwise use the expression after the `||`".
This PR performs the ternary as few times as possible, assigning its
outcome to an env var we can then reuse as needed.
The previous caching configuration was attempting to cache poetry venvs
created in the default shared virtualenvs directory. However, all
langchain packages use `in-project = true` for their poetry virtualenv
setup, which moves the venv inside the package itself instead. This
meant that poetry venvs were not being cached at all.
This PR ensures that the venv gets cached by adding the in-project venv
directory to the cached directories list.
It also makes sure that the cache key *only* includes the lockfile being
installed, as opposed to *all lockfiles* (unnecessary cache misses) or
just the *top-level lockfile* (cache hits when it shouldn't).
# Add action to test with all dependencies installed
PR adds a custom action for setting up poetry that allows specifying a
cache key:
https://github.com/actions/setup-python/issues/505#issuecomment-1273013236
This makes it possible to run 2 types of unit tests:
(1) unit tests with only core dependencies
(2) unit tests with extended dependencies (e.g., those that rely on an
optional pdf parsing library)
As part of this PR, we're moving some pdf parsing tests into the
unit-tests section and making sure that these unit tests get executed
when running with extended dependencies.