A language-agnostic LSP client in Python, with a library interface, intended for building applications around language servers. scubalspy currently supports language servers for Python, Rust, Java, Go, JavaScript, TypeScript, Ruby, C#, Dart and Kotlin. It originally appeared as part of Monitor-Guided Decoding (https://github.com/microsoft/monitors4codegen).

Scubalspy: LSP client library in Python to build applications around language servers

Introduction

This repository hosts scubalspy, a library developed as part of the research for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context" (titled "Guiding Language Models of Code with Global Context using Monitors" on arXiv). The paper introduces Monitor-Guided Decoding (MGD) for code generation with language models: a monitor uses static analysis to guide decoding so that the generated code satisfies correctness properties such as the absence of hallucinated symbol names and a valid order of method calls. For further details about Monitor-Guided Decoding, please refer to the paper and the GitHub repository microsoft/monitors4codegen.

scubalspy is a cross-platform library designed to simplify the process of creating language server clients to query and obtain results of various static analyses from a wide variety of language servers that communicate over the Language Server Protocol. It is easily extensible to support any language that has a Language Server and we aim to continuously add support for more language servers and languages.

Language servers are tools that perform a variety of static analyses on code repositories and provide useful information such as type-directed code completion suggestions, symbol definition locations, symbol references, etc., over the Language Server Protocol (LSP). Since LSP is language-agnostic, scubalspy can provide the results for static analyses of code in different languages over a common interface.

scubalspy intends to ease the process of using language servers, by handling various steps in using a language server:

  • Automatically handling the download of platform-specific server binaries, and setup/teardown of language servers
  • Handling JSON-RPC based communication between the client and the server
  • Maintaining and passing hand-tuned server- and language-specific configuration parameters
  • Providing a simple API to the user, while carrying out all the server-specific protocol steps needed to execute the query/request
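To make the JSON-RPC step above concrete, here is a minimal sketch of how a single LSP message is framed on the wire: a Content-Length header, a blank line, then the JSON body. The helper names encode_lsp_message and decode_lsp_message are hypothetical and are not part of scubalspy; the URI and position are placeholder values.

```python
import json


def encode_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload as an LSP message: header, blank line, body."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_lsp_message(data: bytes) -> dict:
    """Parse one framed message (assumes the whole message is already in data)."""
    header, _, body = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(body[:length].decode("utf-8"))


# A textDocument/definition request; the URI and position are placeholders.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///abs/path/to/project/root/code_file.java"},
        "position": {"line": 163, "character": 4},
    },
}
framed = encode_lsp_message(request)
```

A client library hides this framing, along with request IDs and response matching, behind method calls.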

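Some of the analysis results that scubalspy can provide are:

  • Symbol definition locations (request_definition)
  • Symbol references (request_references)
  • Type-directed code completion suggestions (request_completions)
  • Symbols defined in a document (request_document_symbols)
  • Hover information for a symbol (request_hover)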

Installation

We recommend creating a new virtual environment with Python >= 3.10. To create a virtual environment using conda and activate it:

conda create -n scubalspy_env python=3.10
conda activate scubalspy_env

Further details and instructions on creating Python virtual environments can be found in the official documentation. Miniconda is a lightweight way to obtain the conda command used above.

To install scubalspy using pip, execute the following command:

pip install scubalspy

Supported Languages

scubalspy currently supports the following languages:

Code Language    Language Server
java             Eclipse JDTLS
python           jedi-language-server
rust             Rust Analyzer
csharp           OmniSharp / RazorSharp
typescript       TypeScriptLanguageServer
javascript       TypeScriptLanguageServer
go               gopls
dart             Dart
ruby             Solargraph
kotlin           KotlinLanguageServer

Usage

Example usage:

from scubalspy import SyncLanguageServer
from scubalspy.scubalspy_config import ScubalspyConfig
from scubalspy.scubalspy_logger import ScubalspyLogger
...
config = ScubalspyConfig.from_dict({"code_language": "java"}) # Also supports "python", "rust", "csharp", "typescript", "javascript", "go", "dart", "ruby", "kotlin"
logger = ScubalspyLogger()
lsp = SyncLanguageServer.create(config, logger, "/abs/path/to/project/root/")
with lsp.start_server():
    result = lsp.request_definition(
        "relative/path/to/code_file.java", # Filename of location where request is being made
        163, # line number of symbol for which request is being made
        4 # column number of symbol for which request is being made
    )
    result2 = lsp.request_completions(
        ...
    )
    result3 = lsp.request_references(
        ...
    )
    result4 = lsp.request_document_symbols(
        ...
    )
    result5 = lsp.request_hover(
        ...
    )
    ...
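As a sketch of what one might do with a definition result, the helper below turns LSP-style Location entries into path:line:column strings. It assumes each entry follows the Location shape from the LSP specification (a "uri" plus a "range" with zero-based "start" and "end" positions). Whether scubalspy returns exactly this shape is an assumption, and summarize_locations and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def summarize_locations(locations: List[Dict[str, Any]]) -> List[str]:
    """Render LSP-style Location dicts as 'uri:line:column' strings."""
    summaries = []
    for loc in locations:
        start = loc["range"]["start"]
        # LSP positions are zero-based; add 1 for editor-style display.
        summaries.append(f'{loc["uri"]}:{start["line"] + 1}:{start["character"] + 1}')
    return summaries


# Hypothetical sample shaped like an LSP Location list, not real server output.
sample = [
    {
        "uri": "file:///abs/path/to/project/root/src/Foo.java",
        "range": {
            "start": {"line": 41, "character": 16},
            "end": {"line": 41, "character": 19},
        },
    }
]
print(summarize_locations(sample))
```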

scubalspy also provides an asyncio based API which can be used in async contexts. Example usage (asyncio):

from scubalspy import LanguageServer
...
lsp = LanguageServer.create(...)
async with lsp.start_server():
    result = await lsp.request_definition(
        ...
    )
    ...
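The async pattern above can be tried without a real language server. The sketch below uses FakeLanguageServer, a hypothetical stand-in that only mimics the start_server and request_definition names from the example; it is not the scubalspy class. It drives everything from a single asyncio.run call so that all coroutines share one event loop. Whether a real server tolerates concurrent requests is not stated in this document.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeLanguageServer:
    """Hypothetical stand-in for illustration; NOT the scubalspy LanguageServer."""

    def __init__(self):
        self.running = False

    @asynccontextmanager
    async def start_server(self):
        # Mark the server as running for the duration of the async with block.
        self.running = True
        try:
            yield self
        finally:
            self.running = False

    async def request_definition(self, relative_path, line, column):
        if not self.running:
            raise RuntimeError("server not started")
        await asyncio.sleep(0)  # yield control, as real I/O would
        return [{"path": relative_path, "line": line, "column": column}]


async def lookup_many(lsp, queries):
    """Start the server once and issue all queries inside the same event loop."""
    async with lsp.start_server():
        return await asyncio.gather(*(lsp.request_definition(*q) for q in queries))


results = asyncio.run(lookup_many(FakeLanguageServer(), [("a.py", 1, 2), ("b.py", 3, 4)]))
```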

The file src/scubalspy/language_server.py provides the scubalspy API. The tests under tests/scubalspy/ provide detailed usage examples and can be run with:

pytest tests/scubalspy

Use of scubalspy in AI4Code Scenarios like Monitor-Guided Decoding

scubalspy exposes the language-server features that the Language Server Protocol provides to IDEs like VS Code. This makes it useful for building toolsets that interface with AI systems such as Large Language Models (LLMs).

Monitor-Guided Decoding

One such use case is Monitor-Guided Decoding, where scubalspy is used to obtain static analysis results, such as type-directed completions, that guide the token-by-token generation of code by an LLM. This ensures that all generated identifier and method names are valid in the context of the repository, significantly boosting the compilability of the generated code. MGD also demonstrates the use of scubalspy to create monitors that ensure all function calls in LLM-generated code receive the correct number of arguments, and that the methods of an object are called in the right order according to a protocol (for example, not calling "read" before "open" on a file object).
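The idea of guiding decoding with completions can be illustrated with a toy sketch. It is not the implementation from the paper: the token scores, the completion list, and the function names mask_invalid_tokens and pick_token are all made up for illustration. A candidate token is kept only if the text typed so far plus that token is still a prefix of some valid identifier.

```python
import math
from typing import Dict, Iterable


def mask_invalid_tokens(
    logits: Dict[str, float], typed_prefix: str, valid_identifiers: Iterable[str]
) -> Dict[str, float]:
    """Set the score to -inf for tokens that cannot lead to any valid identifier."""
    valid = list(valid_identifiers)
    masked = {}
    for token, logit in logits.items():
        candidate = typed_prefix + token
        allowed = any(name.startswith(candidate) for name in valid)
        masked[token] = logit if allowed else -math.inf
    return masked


def pick_token(masked: Dict[str, float]) -> str:
    """Greedy choice among the surviving tokens."""
    return max(masked, key=masked.get)


# Made-up values: identifiers a completion request might return, and model scores.
completions = ["getName", "getAge", "setName"]
logits = {"get": 1.2, "fetch": 2.5, "set": 0.3, "name": 0.9}

masked = mask_invalid_tokens(logits, "", completions)
best = pick_token(masked)
```

Here "fetch" has the highest raw score but is masked out because no valid identifier starts with it, so the greedy choice falls to "get".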

Scubalspy in other use cases

Frequently Asked Questions (FAQ)

asyncio-related RuntimeError when executing the tests for MGD

If you get the following error:

RuntimeError: Task <Task pending name='Task-2' coro=<_AsyncGeneratorContextManager.__aenter__() running at
    python3.8/contextlib.py:171> cb=[_chain_future.<locals>._call_set_state() at
    python3.8/asyncio/futures.py:367]> got Future <Future pending> attached to a different loop python3.8/asyncio/locks.py:309: RuntimeError

The python3.8 paths in the traceback indicate that the tests ran on an older interpreter. Please ensure that you create a new environment with Python >= 3.10. For further details, please have a look at the Stack Overflow discussion.
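A quick way to confirm which interpreter is active before running the tests is a version check such as the sketch below. The helper name meets_minimum is hypothetical; the 3.10 threshold comes from the recommendation above.

```python
import sys

MIN_VERSION = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if not meets_minimum():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        f"create an environment with Python >= {MIN_VERSION[0]}.{MIN_VERSION[1]}."
    )
```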

Citing Scubalspy

If you're using Scubalspy in your research or applications, please cite using this BibTeX:

@inproceedings{NEURIPS2023_662b1774,
 author = {Agrawal, Lakshya A and Kanade, Aditya and Goyal, Navin and Lahiri, Shuvendu and Rajamani, Sriram},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
 pages = {32270--32298},
 publisher = {Curran Associates, Inc.},
 title = {Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context},
 url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/662b1774ba8845fc1fa3d1fc0177ceeb-Paper-Conference.pdf},
 volume = {36},
 year = {2023}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
