[Bug] v9: cannot create or use a sparse index against Pinecone Local #679

Description

@deekshant-w

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

There is no way to create and use a sparse index against Pinecone Local with the v9 Python SDK. Two problems compound:

  1. Control-plane create is a deadlock. The SDK forbids sending dimension for sparse indexes (client-side validation in validate_create_inputs, and build_create_body never adds it), but the Pinecone Local server requires dimension in the POST /indexes body.

    • Passing dimension → blocked by the SDK before any request is sent (PineconeValueError: dimension must not be provided for sparse indexes).
    • Omitting dimension → server returns 422 ... missing field 'dimension'.
  2. Even when dimension is forced through, you don't get a sparse index. Bypassing the SDK builder and POSTing dimension: 1 directly returns 201 Created, but describe_index reports the index as vector_type='dense' — Pinecone Local silently ignores vector_type: "sparse". Any sparse_values upsert then fails with [400] Vector dimension 0 does not match the dimension of the index 1.
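The deadlock in item 1 can be reproduced without a running server. The sketch below is a simplified stand-in, not the SDK's or Pinecone Local's actual code: `client_check`, `server_check`, and `try_create` are hypothetical names that mimic only the two rules described above, with error strings abbreviated from the logs.

```python
from typing import Optional


def client_check(vector_type: str, dimension: Optional[int]) -> Optional[str]:
    """Hypothetical model of the SDK rule: sparse indexes must not carry a dimension."""
    if vector_type == "sparse" and dimension is not None:
        return "dimension must not be provided for sparse indexes"
    return None


def server_check(body: dict) -> Optional[str]:
    """Hypothetical model of the Pinecone Local rule: the body must contain dimension."""
    if "dimension" not in body:
        return "422 missing field `dimension`"
    return None


def try_create(vector_type: str, dimension: Optional[int]) -> str:
    """Run the client check, then the server check, and report who rejects."""
    err = client_check(vector_type, dimension)
    if err:
        return f"client rejected: {err}"
    body = {"vector_type": vector_type}
    if dimension is not None:
        body["dimension"] = dimension
    err = server_check(body)
    if err:
        return f"server rejected: {err}"
    return "accepted"


# Neither choice of dimension gets a sparse create past both checks.
for dim in (None, 1):
    print(f"sparse, dimension={dim}: {try_create('sparse', dim)}")
```

Under this model every sparse request is rejected by one side or the other, which is the behaviour paths A and B show against the real SDK and server.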

Expected Behavior

pc.create_index(name=..., vector_type="sparse", metric="dotproduct", spec=ServerlessSpec(...)) should create a true sparse index against Pinecone Local (without requiring a dimension), and upsert with sparse_values should succeed — matching Pinecone cloud and the documented sparse-index examples.

If sparse indexes are not yet supported by Pinecone Local, the SDK should surface a clear, actionable error instead of a 422 missing field 'dimension' / 400 dimension mismatch, and the limitation should be documented.
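As an illustration of what a clearer error could look like, here is a minimal sketch that maps the two raw errors from this issue to an actionable message. `explain_local_sparse_error` and `HINT` are hypothetical and not part of the SDK. The matching is based only on the error strings quoted in this report, and the hint reflects what was observed here rather than a confirmed statement about Pinecone Local.

```python
from typing import Optional

# Wording is an assumption based on the behaviour observed in this issue.
HINT = (
    "Pinecone Local does not appear to support sparse indexes; "
    "use a dense index locally or test sparse indexes against Pinecone cloud."
)


def explain_local_sparse_error(status: int, message: str) -> Optional[str]:
    """Return a friendlier message for the two failures seen here, else None."""
    if status == 422 and "missing field" in message and "dimension" in message:
        return f"Sparse create_index was rejected by the server. {HINT}"
    if status == 400 and "Vector dimension 0 does not match" in message:
        return f"sparse_values were upserted into a dense index. {HINT}"
    return None  # Unrelated error: leave it untouched.


print(explain_local_sparse_error(
    422, "Failed to deserialize the JSON body: missing field `dimension`"
))
```

A caller could apply such a helper when catching `ApiError` around `create_index` or `upsert`. How the status code and message are read off the exception is left out because it depends on the SDK's error type.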

Steps To Reproduce

Start Pinecone Local on http://localhost:5080 via Docker, install pinecone==9.1.0, then:

A. Pass dimension → SDK rejects it client-side

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="pclocal", host="http://localhost:5080", ssl_verify=False)
pc.create_index(
    name="sparse-index", vector_type="sparse", metric="dotproduct",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    deletion_protection="disabled", dimension=1,
)                                                       # PineconeValueError, raised before any request is sent

B. Omit dimension → server 422

pc.create_index(
    name="sparse-index", vector_type="sparse", metric="dotproduct",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    deletion_protection="disabled",
)                                                       # ApiError [422]: missing field `dimension`

C. Force dimension via raw POST → creates a dense index, sparse upsert fails

body = {
    "name": "sparse-index", "metric": "dotproduct", "vector_type": "sparse",
    "deletion_protection": "disabled", "dimension": 1,
    "spec": {"serverless": {"cloud": "aws", "region": "us-east-1"}},
}
pc.indexes._http.post("/indexes", json=body)            # 201 Created
desc = pc.describe_index("sparse-index")
print(desc.vector_type, desc.dimension)                 # -> dense 1   (not sparse!)
idx = pc.Index(host=desc.host.replace("https://", "http://"))
idx.upsert(namespace="ns", vectors=[{
    "id": "vec1", "sparse_values": {"values": [1.7, 0.4], "indices": [10, 20]},
}])                                                     # ApiError [400]: Vector dimension 0 does not match ... index 1
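Because the create call returns 201 even though the index comes back dense, a guard on the `describe_index` result makes the failure visible before any upsert. The following is a minimal sketch: `require_sparse` is a hypothetical helper, and the `SimpleNamespace` object only stands in for the description object so the snippet runs without a server.

```python
from types import SimpleNamespace


def require_sparse(desc) -> None:
    """Raise early if the described index is not sparse (hypothetical helper)."""
    vector_type = getattr(desc, "vector_type", None)
    if vector_type != "sparse":
        raise RuntimeError(
            f"expected a sparse index, but describe_index reported "
            f"vector_type={vector_type!r}; sparse_values upserts will fail"
        )


# Stand-in for the description returned in path C (vector_type='dense', dimension=1).
fake_desc = SimpleNamespace(vector_type="dense", dimension=1)
try:
    require_sparse(fake_desc)
except RuntimeError as exc:
    print(exc)
```

Against a real client this would be called as `require_sparse(pc.describe_index("sparse-index"))` right after the create.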

D. Disable the SDK's sparse-validation branch, then use the public create_index with dimension=1

To confirm the client-side check is the only thing blocking path A, comment out this branch in validate_create_inputs (pinecone/_internal/indexes_helpers.py):

# if resolved_vt == "sparse" and dimension is not None:
#     raise ValidationError("dimension must not be provided for sparse indexes")

then:

pc.create_index(
    name="sparse-index", vector_type="sparse", metric="dotproduct",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    deletion_protection="disabled", dimension=1,
)                                                       # 201 Created
desc = pc.describe_index("sparse-index")
print(desc.vector_type, desc.dimension)                 # -> dense 1   (still not sparse!)
idx = pc.Index(host=desc.host.replace("https://", "http://"))
idx.upsert(namespace="ns", vectors=[{
    "id": "vec1", "sparse_values": {"values": [1.7, 0.4], "indices": [10, 20]},
}])                                                     # ApiError [400]: same dimension mismatch as in C

The create now succeeds through the public API and dimension=1 reaches the server, but the index is still created as dense and the sparse upsert fails identically to C — confirming the SDK validation is what blocks A, while the dense-only server behaviour is the deeper blocker.
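One plausible reading of the 400 message is that the dense index compares the length of the vector's dense `values` with the index dimension, and a sparse-only vector has no dense `values` at all. The sketch below models that reading. It is an assumption about the server's check, not its actual code, and `dense_dimension_error` is a hypothetical name.

```python
from typing import Optional


def dense_dimension_error(vector: dict, index_dimension: int) -> Optional[str]:
    """Model a dense-index check: dense `values` length must equal the dimension."""
    dense_len = len(vector.get("values", []))  # sparse-only vectors have no `values`
    if dense_len != index_dimension:
        return (
            f"Vector dimension {dense_len} does not match "
            f"the dimension of the index {index_dimension}"
        )
    return None


# The vector used in paths C and D: sparse_values only, no dense values.
vec = {"id": "vec1", "sparse_values": {"values": [1.7, 0.4], "indices": [10, 20]}}
print(dense_dimension_error(vec, 1))
```

If this reading is right, no choice of `dimension` at create time can make a sparse-only upsert succeed, since the limit shown in the logs requires a dimension greater than 0.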

Relevant log output

# A — pass dimension
pinecone.errors.exceptions.PineconeValueError: dimension must not be provided for sparse indexes

# B — omit dimension
pinecone.errors.exceptions.ApiError: [422] Failed to deserialize the JSON body into the target type: missing field `dimension` at line 1 column 197

# C — raw POST then sparse upsert
describe_index -> vector_type='dense', dimension=1
pinecone.errors.exceptions.ApiError: [400] Vector dimension 0 does not match the dimension of the index 1

# D — SDK validation patched out, then create_index + sparse upsert
describe_index -> vector_type='dense', dimension=1
pinecone.errors.exceptions.ApiError: [400] Vector dimension 0 does not match the dimension of the index 1

# also: dimension=0 at create
pinecone.errors.exceptions.ApiError: [400 INVALID_ARGUMENT] Bad request: Invalid dimension: 0. Must be greater than 0 and less than 20,000

# original example run with PineconeGRPC — describe shows vector_type='dense', then the data-plane upsert fails:
describe_index -> vector_type='dense', dimension=1
pinecone.errors.exceptions.PineconeConnectionError: received corrupt message of type InvalidContentType

Environment

- **SDK**: `pinecone==9.1.0`
- **Python**: 3.13
- **OS**: Windows 11
- **Pinecone Local**: `ghcr.io/pinecone-io/pinecone-local:latest`, `org.opencontainers.image.version=v1.0.0.rc0` (created 2025-02-27), started with `PORT=5080`, ports `5080-5090` exposed.

Additional Context

The simplest example in the "Local development with Pinecone Local" documentation also fails, in several different ways.

Labels

bug (Something isn't working)
