User Interfaces to AI LLMs

Introduction

  • there are now many AI Large Language Models (LLMs) and many ways to access them, although many require setting up a payment account
  • some can be set up to be private so that you are not sending information over the internet
  • some allow access to a variety of LLMs from the same interface

Free LLMs hosted on the internet

  • Microsoft's Bing AI
    • unfortunately not very useful for clinical queries
  • Perplexity AI free version
    • much better than Bing AI, but the paid Pro version offers useful additional features, including access to the latest LLMs such as GPT-4 Turbo and Claude 2
    • Perplexity's own LLMs are:
      • based on the mistral-7b and llama2-70b base models
      • online versions are augmented with in-house web search, indexing, and crawling infrastructure; the search index is large, updated on a regular cadence, and uses sophisticated ranking algorithms to ensure high-quality, non-SEOed sites are prioritised, while website excerpts (which they call “snippets”) are provided to their pplx-online models to enable responses with the most up-to-date information; these models are then continually fine-tuned to achieve high performance on axes such as helpfulness, factuality, and freshness 1)
    • uses Groq's very fast AI inference engine and can run Llama 2 70B or Mixtral
    • API access is charged at a small fee per million tokens used
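The API call itself is a standard OpenAI-style chat completion POSTed to Perplexity's endpoint. A minimal sketch of building such a request follows; the model name and system prompt are illustrative assumptions, and you would substitute your own API key:

```python
import json

# Sketch of a request to Perplexity's OpenAI-compatible chat completions
# endpoint. The model name below is an illustrative assumption; check the
# provider's current model list before use.
API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question, model="llama-2-70b-chat", api_key="YOUR_API_KEY"):
    """Return the headers and JSON body for a single-question query."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "Be precise and concise."},
            {"role": "user", "content": question},
        ],
    }
    return headers, json.dumps(body)

headers, body = build_request("What is Mixtral 8x7B?")
# The headers and body would then be sent with any HTTP client.
```

Because billing is per million tokens, keeping prompts short directly reduces cost.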

Free LLMs hosted locally on your computer

  • there are a variety of free LLM user interfaces such as:
  • these generally allow access to local free LLMs, such as the Mistral Instruct and Mixtral 8x7B Instruct models, which you can download for free from hosting websites such as HuggingFace
    • NB. models larger than your available VRAM will run much more slowly
  • these avoid your information being sent over the internet but require a powerful computer to run (preferably with at least 16 GB of VRAM, although 8 GB will suffice for smaller models or if you are happy to wait a few minutes for responses)
  • these may also allow you to privately embed your own document files and use them in your queries
  • LLocalSearch
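To judge whether a model will fit in your VRAM, a common rule of thumb is parameter count times bytes per weight, plus some overhead for the KV cache and activations. The 20% overhead factor below is an illustrative assumption; real usage varies with context length and runtime:

```python
# Rough VRAM estimate for a quantised local model:
# parameters x (bits per weight / 8), plus ~20% overhead (assumed here)
# for KV cache and activations.
def estimate_vram_gb(n_params_billion, bits_per_weight=4, overhead=1.2):
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # decimal GB

# e.g. a 7B model at 4-bit quantisation fits comfortably in 8 GB:
print(round(estimate_vram_gb(7), 1))        # ~4.2 GB
# Mixtral 8x7B (~46.7B total parameters) at 4-bit needs far more:
print(round(estimate_vram_gb(46.7), 1))     # ~28.0 GB
```

This is why a 7B model is usable on an 8 GB card while Mixtral 8x7B generally is not, even quantised.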

Free LLMs hosted locally on your iPhone

Accessing your home local LLM via your iPhone on the internet

NGrok - Docker - Ollama approach
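A minimal sketch of the idea: run Ollama in a Docker container serving on its default port 11434, and expose that port through an ngrok tunnel so your iPhone can reach it from anywhere. The service names and the authtoken variable below are assumptions; consult each image's documentation for current options.

```yaml
# Hypothetical docker-compose sketch: Ollama behind an ngrok tunnel.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
  ngrok:
    image: ngrok/ngrok
    command: http ollama:11434
    environment:
      - NGROK_AUTHTOKEN=your-token-here   # placeholder, not a real token
    depends_on:
      - ollama
volumes:
  ollama:
```

ngrok then prints a public URL that forwards to the Ollama API; anyone with that URL can query your model, so treat it as a secret or add authentication.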

Pay per use LLMs on the internet

  • these usually charge a per-token usage fee or a monthly subscription
  • ChatGPT 4
  • Google Gemini Advanced, running on Google Gemini Ultra 1.0, at $US20/mth, introduced in Feb 2024
    • faster than GPT-4 and of similar speed to Perplexity, although Perplexity offers more features
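Per-token billing typically prices input and output tokens separately, so the cost of a query can be worked out as a simple weighted sum. The sketch below illustrates the arithmetic; the model names and prices are hypothetical placeholders, not any provider's actual rates:

```python
# Illustrative per-token cost accounting; prices are placeholders.
PRICE_PER_MILLION = {            # US$ per million tokens (hypothetical)
    "small-model": {"input": 0.20, "output": 0.20},
    "large-model": {"input": 10.00, "output": 30.00},
}

def query_cost(model, input_tokens, output_tokens):
    """Cost in US$ for one request, billed separately for input and output."""
    p = PRICE_PER_MILLION[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1e6

# A 2,000-token prompt with a 500-token answer on the large model:
print(f"${query_cost('large-model', 2000, 500):.3f}")  # $0.035
```

Output tokens often cost several times more than input tokens, so asking for concise answers can matter more than shortening prompts.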

Third party web interfaces to LLMs

  • these often charge a monthly fee but can provide access to various LLMs and often provide additional features
  • eg. Perplexity AI Pro at $US20/mth adds important extra features such as a choice of LLMs (GPT-4, Claude 2.1, Gemini, or Perplexity), GPT-4 Vision, the option to restrict sources to scientific papers, and over 300 queries/day
    • the free tier does not give access to Copilot and only provides the Perplexity LLM, which does not appear to give as good a response as a local Mistral 7B
    • presumably this uses the technology processes outlined in https://www.youtube.com/watch?v=IbOoEJ9N2z8
    • the Perplexity AI answer engine was built in 6 months at a cost of under $US4m using Megatron-LM and open-source Ray via Anyscale; the default LLM is their fine-tuned version of GPT-3.5, as it is 4x faster and 4x cheaper than GPT-4 and almost as good - see Perplexity AI's CEO talk

Third party API interfaces to LLMs

  • Perplexity.ai - offers the added option of up-to-date web search data via the “online” versions of its models
  • LiteLLM - standardises the API calling input/output format for access to 100+ LLMs
  • Groq - uses Groq's very fast inference hardware
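The appeal of a standardising layer like LiteLLM is one call signature routed to many providers. The toy dispatcher below imitates that idea so it runs without the library or any network access; the provider functions are stand-ins, not real SDK calls (the actual library's entry point is `litellm.completion(model=..., messages=...)`):

```python
# Toy sketch of the idea behind an API-standardising layer: one
# OpenAI-style call signature, routed to a provider by model-name prefix.
def _call_openai(model, messages):   # stand-in for a hosted-API call
    return f"[openai:{model}] {messages[-1]['content']}"

def _call_ollama(model, messages):   # stand-in for a local Ollama call
    return f"[ollama:{model}] {messages[-1]['content']}"

PROVIDERS = {"gpt": _call_openai, "ollama": _call_ollama}

def completion(model, messages):
    """Route one standardised request to the matching provider."""
    prefix = model.split("/")[0] if "/" in model else "gpt"
    provider = PROVIDERS.get(prefix, _call_openai)
    return provider(model, messages)

msgs = [{"role": "user", "content": "hello"}]
print(completion("gpt-4", msgs))           # routed to the OpenAI stand-in
print(completion("ollama/mistral", msgs))  # routed to the Ollama stand-in
```

Because every provider is reached through the same `completion` call, swapping models becomes a one-string change in application code.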
it/ai_llm_guis.txt · Last modified: 2024/05/11 23:41 by gary1
