Model Card for TinyLlama-1.1B Fine-tuned on NLP, ML, Generative AI, and Computer Vision Q&A
This model is fine-tuned from TinyLlama-1.1B (TinyLlama/TinyLlama-1.1B-Chat-v1.0) to answer domain-specific questions in Natural Language Processing (NLP), Machine Learning (ML), Deep Learning (DL), Generative AI, and Computer Vision (CV). It is intended for educational, research, and professional applications that need concise, context-aware answers.
Model Details
Model Description
This model is tuned to give concise, domain-specific answers to questions in AI-related fields. It builds on the TinyLlama architecture and was fine-tuned on a curated dataset of Q&A pairs to keep responses relevant and coherent.
- Developed by: Harikrishnan46624
- Funded by: Self-funded
- Shared by: Harikrishnan46624
- Model Type: Text Generation (decoder-only causal language model)
- Language(s): English
- License: Apache 2.0
- Fine-tuned from: TinyLlama-1.1B
Model Sources
- Repository: Fine-Tuning Notebook on GitHub
- Demo: [Demo Link to be Added]
Use Cases
Direct Use
- Answering technical questions in AI, ML, DL, LLMs, Generative AI, and Computer Vision.
- Supporting educational content creation, research discussions, and technical documentation.
Downstream Use
- Fine-tuning for industry-specific applications like healthcare, finance, or legal tech.
- Integrating into specialized chatbots, virtual assistants, or automated knowledge bases.
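As a rough illustration of the chatbot integration mentioned above, the sketch below keeps a short conversation history and passes it to a text generator. The `ChatSession` class, its `max_turns` parameter, and the `generate` callable are hypothetical names introduced here for illustration; they are not part of this model or of any library. In practice, `generate` would wrap a call to the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Hypothetical helper: keeps a bounded chat history for a generator callable."""

    def __init__(self, generate: Callable[[List[Message]], str], max_turns: int = 4):
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        self._generate = generate    # assumption: takes messages, returns reply text
        self._max_turns = max_turns  # number of user turns kept in the context
        self.history: List[Message] = []

    def ask(self, question: str) -> str:
        self.history.append({"role": "user", "content": question})
        # Keep the last `max_turns` user turns plus the assistant replies
        # between them, so the context always starts with a user message.
        self.history = self.history[-(2 * self._max_turns - 1):]
        reply = self._generate(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without downloading a model.
    def echo(messages: List[Message]) -> str:
        return "echo: " + messages[-1]["content"]

    session = ChatSession(echo, max_turns=2)
    print(session.ask("What is a convolution?"))
```

Bounding the history is one simple way to stay within the context window of a small model; a production assistant would more likely trim by token count than by turn count.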
Out-of-Scope Use
- Generating non-English responses (English-only capability).
- Handling non-technical, unrelated queries outside the AI domain.
Bias, Risks, and Limitations
- Bias: Because it was trained on domain-specific datasets, the model may be biased toward AI-related terminology and may not generalize well to other domains.
- Risks: May generate incorrect or misleading information if the query is ambiguous or goes beyond the model’s scope.
- Limitations: May struggle with highly complex or nuanced queries not covered in its fine-tuning data.
Recommendations
- For critical or high-stakes applications, use the model with human oversight.
- Regularly update the fine-tuning datasets to ensure alignment with the latest research and advancements in AI.
How to Get Started
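To use the model, install the transformers library (pip install transformers torch) and run the following code snippet:
from transformers import pipeline
# Load the fine-tuned model. It is a decoder-only Llama model, so the task is "text-generation".
generator = pipeline("text-generation", model="Harikrishnan46624/finetuned_llama2-1.1b-chat")
# Generate a response
output = generator("What is the difference between supervised and unsupervised learning?", max_new_tokens=128)
# The pipeline returns a list with one dict per generated sequence
print(output[0]["generated_text"])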
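For chat-style prompting, a minimal sketch follows. It assumes a recent transformers release in which the text-generation pipeline accepts a list of role/content messages, and it assumes the tokenizer in this repository ships a chat template; neither is confirmed by this card. The helper names `build_messages` and `answer` are hypothetical.

```python
from typing import Dict, List, Optional

MODEL_ID = "Harikrishnan46624/finetuned_llama2-1.1b-chat"


def build_messages(question: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a chat-format message list, with an optional system prompt first."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


def answer(question: str, max_new_tokens: int = 128) -> str:
    """Generate a reply with the model (downloads the weights on first use)."""
    # Imported lazily so build_messages can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    result = generator(build_messages(question), max_new_tokens=max_new_tokens, do_sample=False)
    # Assumption: with chat input, "generated_text" holds the whole conversation
    # and the last entry is the newly generated assistant message.
    return result[0]["generated_text"][-1]["content"]


if __name__ == "__main__":
    print(answer("What is the difference between supervised and unsupervised learning?"))
```

Greedy decoding (`do_sample=False`) is chosen here only to make answers repeatable; sampling settings are a matter of preference.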
Model tree for Harikrishnan46624/finetuned_llama2-1.1b-chat
Base model
TinyLlama/TinyLlama-1.1B-Chat-v1.0