Machine Learning - Learning/Language Models


Discussion of models, their use, setup, and options.

Please include the models used with your outputs; workflows are optional.

Model Catalog

We follow Lemmy’s code of conduct.


founded 1 year ago
LLM Finetuning Risks (llm-tuning-safety.github.io)
submitted 11 months ago by ylai@lemmy.ml to c/models@lemmy.intai.tech
Chinchilla’s Death (espadrine.github.io)
submitted 11 months ago by ylai@lemmy.ml to c/models@lemmy.intai.tech

Abstract

We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous work that is based on the OpenAI Codex, which is a descendant of GPT-3, to generate similar kernels with simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 and our original GPT-3 baseline by using a similar metric. Llama-2 has a simplified model that shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct.
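
For context on what such a kernel looks like, here is a minimal AXPY (y ← a·x + y) sketch in C++ with OpenMP, one of the programming models listed in the abstract. It is illustrative only and not taken from the paper or from model-generated output; the function name, problem size, and test values are assumptions.

```cpp
// Minimal AXPY sketch: y <- a * x + y, parallelized with OpenMP.
#include <cstddef>
#include <cstdio>
#include <vector>

void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    // Each loop iteration is independent, so the loop can be split across threads.
    #pragma omp parallel for
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}

int main() {
    const std::size_t n = 1 << 20;            // illustrative problem size
    std::vector<double> x(n, 1.0), y(n, 2.0); // simple test data
    axpy(3.0, x, y);
    std::printf("y[0] = %f\n", y[0]);         // expect 5.0
    return 0;
}
```

Compile with OpenMP enabled (e.g. `g++ -fopenmp`); without it the pragma is ignored and the loop runs serially.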

submitted 1 year ago* (last edited 1 year ago) by ylai@lemmy.ml to c/models@lemmy.intai.tech

Corresponding arXiv preprint: https://arxiv.org/abs/2308.03762
