Language Models

30 Dec 2022 03:47 - 29 Nov 2023 07:10
    • ChatGPT
      • Prompt: Explain how GPT3 works, in computational terms.
      • GPT-3 (Generative Pre-trained Transformer 3) is a language generation model developed by OpenAI that uses deep learning techniques to generate human-like text. It is trained on a large dataset of text and uses a transformer neural network architecture to process the input and generate the output. Here's a high-level overview of how GPT-3 works:
        • The model is fed a sequence of input tokens (e.g., words or subwords). These tokens are passed through an embedding layer, which converts them into dense vectors that represent their meanings.
        • The input vectors are then passed through a series of transformer blocks. Each block consists of a self-attention layer and a feed-forward layer. The self-attention layer allows the model to consider the relationships between different tokens in the input, while the feed-forward layer transforms the input using a learned function.
        • The output of the transformer blocks is passed through a final linear layer, which converts it into a probability distribution over the possible output tokens.
        • The model then generates the output by sampling from this probability distribution and adding the sampled tokens to the output sequence.
        • This process is repeated until the model generates the desired number of output tokens, at which point the output sequence is returned.
      • GPT-3 is pre-trained on a large dataset of text, which means that it has already learned to perform a wide range of language tasks by being exposed to a vast amount of data. When it is fine-tuned on a specific task, it is able to adapt to that task quickly because it has already learned many of the underlying patterns and structures of language. This allows it to generate high-quality text that is difficult to distinguish from text written by humans.
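      • A minimal numpy sketch of the decode loop described above (my illustration, not part of the ChatGPT answer). The sizes, random weights, and simplified blocks (single head, no layer norm, identity Q/K/V projections) are all assumptions made for readability; the real GPT-3 stacks 96 such blocks over roughly 175B learned parameters.
        ```python
        import numpy as np

        rng = np.random.default_rng(0)
        VOCAB, D_MODEL = 50, 16                       # toy vocabulary and width

        W_embed = rng.normal(size=(VOCAB, D_MODEL))   # embedding layer (step 1)
        W_out = rng.normal(size=(D_MODEL, VOCAB))     # final linear layer (step 3)

        def self_attention(x):
            # Causal self-attention, with identity Q/K/V as a stand-in for the
            # learned projections: each position attends only to earlier ones.
            scores = x @ x.T / np.sqrt(x.shape[-1])
            mask = np.tril(np.ones_like(scores, dtype=bool))
            scores = np.where(mask, scores, -np.inf)
            w = np.exp(scores - scores.max(axis=-1, keepdims=True))
            return (w / w.sum(axis=-1, keepdims=True)) @ x

        def transformer_block(x):
            x = x + self_attention(x)   # self-attention sub-layer + residual
            return x + np.tanh(x)       # stand-in for the feed-forward sub-layer

        def generate(tokens, n_new, temperature=1.0):
            tokens = list(tokens)
            for _ in range(n_new):
                x = W_embed[tokens]                   # (seq_len, d_model)
                for _ in range(2):                    # GPT-3 stacks 96 of these
                    x = transformer_block(x)
                logits = x[-1] @ W_out                # next-token scores (step 3)
                p = np.exp((logits - logits.max()) / temperature)
                p /= p.sum()
                tokens.append(int(rng.choice(VOCAB, p=p)))  # sample + append (step 4)
            return tokens

        print(generate([3, 14, 15], n_new=5))
        ```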
    • Ridiculous to have a page on this, things move so fast, but I need to organize my thoughts.
    • How will they change human-computer interaction?
      • A whole different medium than the mouse-and-affordance model. Better for some things, worse for others.
      • Mixed systems could be very powerful; I wonder if anyone is exploring them (almost surely someone is).
      • Not actually clear LLMs are useful beyond something like Siri or Alexa. Presumably they can handle more complex speech, but…
    • OK, a question I should know the answer to: what are some ways to have a customized chat machine, i.e., one trained on a corpus of my choosing? (I could make an artificial me; in fact I'm raring to go with that.) A sketch of one lightweight approach follows this list.
      • Introducing GPTs
        • sigh, a paid feature. Was willing to pay but they have a waitlist
      • Notion has a feature for this
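      • Pending access to any of those, a zero-training route is retrieval-augmented prompting: index one's own writing, retrieve the chunks most relevant to each question, and prepend them to the prompt. A minimal sketch, assuming a local corpus and plain TF-IDF retrieval via sklearn; the corpus strings are placeholders, and the final call to a chat model is left as whatever API you actually use.
        ```python
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        # Stand-ins for chunks of one's own writing (the "corpus of me").
        corpus = [
            "Notes on language models and how they might change interaction design.",
            "Essay on tools for thought and outliner workflows.",
            "Blog post on organizing research notes in Logseq.",
        ]

        vectorizer = TfidfVectorizer()
        doc_vectors = vectorizer.fit_transform(corpus)

        def build_prompt(question, k=2):
            # Rank chunks by cosine similarity to the question, keep the top k,
            # and prepend them so the chat model can answer "in my voice".
            q_vec = vectorizer.transform([question])
            scores = cosine_similarity(q_vec, doc_vectors)[0]
            top = scores.argsort()[::-1][:k]
            context = "\n---\n".join(corpus[i] for i in top)
            return (f"Answer as the author of these notes, using this context:\n"
                    f"{context}\n\nQuestion: {question}")

        # The resulting prompt goes to whichever chat model you have access to.
        print(build_prompt("How might LLMs change human-computer interaction?"))
        ```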