Gemini - Pytorch

Project description

Gemini

An open-source implementation of Gemini, the model that will "eclipse ChatGPT". It seems to work by taking in all modalities directly, without a separate encoder of some kind, which means that the encoding is built into the model.

input sequences {texts, audio, imgs, video} -> [tokens] -> transformer -> conditional decoding for img gen

This architecture looks very similar to Fuyu's architecture, just extended to many modalities: instead of using a ViT encoder, you pass the img embeddings directly into the transformer.
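To make the "no ViT encoder" idea concrete, here is a minimal sketch of turning an image into a sequence of patch vectors and linearly projecting each one. It is written in plain Python on nested lists so it runs anywhere. The names `patchify` and `project`, the single-channel image, and the hand-written weight matrix are all assumptions for illustration, not this package's API; in a PyTorch model the projection would typically be a learned linear layer.

```python
from typing import List

# Assumption: a single-channel image stored as rows of pixel values.
Image = List[List[float]]


def patchify(image: Image, patch_size: int) -> List[List[float]]:
    """Split an H x W image into flattened patch_size x patch_size patches.

    Patches are returned in row-major order (left to right, top to bottom).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch row by row into a single vector.
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


def project(patch: List[float], weight: List[List[float]]) -> List[float]:
    """Linearly project one flattened patch.

    `weight` has shape (embed_dim, patch_len); a hypothetical stand-in
    for a learned projection matrix.
    """
    return [sum(w * x for w, x in zip(row, patch)) for row in weight]


# Usage: a 4x4 image with 2x2 patches gives 4 patch vectors of length 4.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
patches = patchify(image, patch_size=2)
weight = [[1, 0, 0, 0], [0.25, 0.25, 0.25, 0.25]]  # embed_dim = 2
embeddings = [project(p, weight) for p in patches]
```

Each resulting vector plays the role of one img embedding that goes straight into the transformer alongside the text tokens.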

The token inputs to Gemini will most likely be marked by special modality tokens such as [IMG] or <img>, and [AUDIO] or <audio>.
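As a rough illustration of how such markers could be used, the sketch below flattens a list of per-modality segments into one token sequence, inserting a marker before each non-text segment. The marker strings follow the guesses above; the function name `build_sequence`, the placeholder token strings, and the choice to leave text unmarked are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

# Assumption: one opening marker per non-text modality, as guessed above.
MODALITY_TOKENS: Dict[str, str] = {
    "image": "[IMG]",
    "audio": "[AUDIO]",
}


def build_sequence(segments: List[Tuple[str, List[str]]]) -> List[str]:
    """Interleave modality segments into a single flat token sequence.

    Each segment is (modality, tokens). Text tokens are passed through
    unchanged; other modalities are prefixed with their marker token.
    """
    sequence: List[str] = []
    for modality, tokens in segments:
        if modality != "text":
            if modality not in MODALITY_TOKENS:
                raise ValueError(f"unknown modality: {modality}")
            sequence.append(MODALITY_TOKENS[modality])
        sequence.extend(tokens)
    return sequence


# Usage: placeholder strings stand in for real token ids or embeddings.
sequence = build_sequence([
    ("text", ["a", "cat"]),
    ("image", ["patch0", "patch1"]),
    ("audio", ["frame0"]),
])
```

In an actual model the marker would be a reserved vocabulary id with its own learned embedding, and the image and audio entries would be embedding vectors rather than strings.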

Codi also has conditional generation that leverages the tokenized outputs.

To implement this, I plan to cover the img embedding first, make sure that works well, and then move on to the audio embeddings and then the video.

References

  • Combine reinforcement learning with a modular pretrained transformer and multi-modal capabilities (image, audio)
  • Self-improving mechanisms like RoboCat
  • PPO? or MPO
  • Get good at backtracking and exploring alternative paths
  • Speculative decoding
  • Algorithm of Thoughts
  • RLHF
  • Gemini Report
  • Gemini Landing Page

Todo

  • Implement the img feature embedder, align imgs with text, and pass them into the transformer

Download files

Source Distribution

gemini_torch-0.0.1.tar.gz (18.6 kB)

Uploaded Source

Built Distribution

gemini_torch-0.0.1-py3-none-any.whl (18.3 kB)

Uploaded Python 3

File details

Details for the file gemini_torch-0.0.1.tar.gz.

File metadata

  • Download URL: gemini_torch-0.0.1.tar.gz
  • Upload date:
  • Size: 18.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.3.2 CPython/3.11.0 Darwin/22.4.0

File hashes

Hashes for gemini_torch-0.0.1.tar.gz
Algorithm Hash digest
SHA256 c90b479b21607eefc1b37824fb0e675481ada44b27f2ae07bf3da28bc361c6f2
MD5 195661e0fb01d01ed1a37af9e0ed1058
BLAKE2b-256 0d6210846f14aafc84b04f609c421b2d935b718fd2fe28f9959dd0c39176b08c

File details

Details for the file gemini_torch-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: gemini_torch-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 18.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.3.2 CPython/3.11.0 Darwin/22.4.0

File hashes

Hashes for gemini_torch-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 730e449880d9635e395a33ad6daa42c5a2b5a93b941d45e0c50f390648a58fc4
MD5 95b8acefc0d27c5e2f9ab838dcfdf25d
BLAKE2b-256 f1e0018d6903a968004e2f66a8c6c65d68ac5a4c89bb3579ac0d31aa1964896e
