A Docker Compose stack for running local LLMs on your own hardware with NVIDIA GPU acceleration, a web UI, and one-command model setup via profiles.
Moved to git repository: https://github.com/denji/nginx-tuning
You can use whichever web server you like for this configuration; I chose nginx because it is the one I work with most.
Generally, a properly configured nginx can handle up to 400K to 500K requests per second (clustered). The most I have seen is 50K to 80K requests per second (non-clustered) at 30% CPU load. Of course, this was on 2 x Intel Xeon with HyperThreading enabled, but it can work without problems on slower machines.
You must understand that this config is used in a testing environment and not in production, so you will need to find the best way to implement most of these features for your servers.
    #!/usr/bin/bash
    which ansible >/dev/null 2>&1
    if [ $? -ne 0 ];
    then
    echo "Installing Ansible..."
    sleep 5
    pushd .
    cd ~
    pacman -S libyaml-devel python2 tar libffi libffi-devel gcc pkg-config make openssl-devel openssh libcrypt-devel --noconfirm --needed
    curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
http://www.oreilly.com/data/free/files/2014-data-science-salary-survey.pdf
http://www.oreilly.com/data/free/files/2015-data-science-salary-survey.pdf
http://www.oreilly.com/data/free/files/Data_Analytics_in_Sports.pdf
http://www.oreilly.com/data/free/files/advancing-procurement-analytics.pdf
http://www.oreilly.com/data/free/files/ai-and-medicine.pdf
http://www.oreilly.com/data/free/files/analyzing-data-in-the-internet-of-things.pdf
http://www.oreilly.com/data/free/files/analyzing-the-analyzers.pdf
http://www.oreilly.com/data/free/files/architecting-data-lakes.pdf
http://www.oreilly.com/data/free/files/being-a-data-skeptic.pdf
http://www.oreilly.com/data/free/files/big-data-analytics-emerging-architecture.pdf
States. The final frontier. These are the voyages of an enterprising developer. Her eternal mission: to explore strange new techniques, to seek out better ways to engineer for mental models and new design patterns. To boldly go where a few awesome devs have gone before.
So you’ve found our poignant guide to SCXML and surely you’re wondering “Why should I want to go out of my way to use formal state machines?” or something like that. Hopefully this introduction addresses that kind of question.
People

| :bowtie: | 😄 :smile: | 😆 :laughing: |
|---|---|---|
| 😊 :blush: | 😃 :smiley: | :relaxed: |
| 😏 :smirk: | 😍 :heart_eyes: | 😘 :kissing_heart: |
| 😚 :kissing_closed_eyes: | 😳 :flushed: | 😌 :relieved: |
| 😆 :satisfied: | 😁 :grin: | 😉 :wink: |
| 😜 :stuck_out_tongue_winking_eye: | 😝 :stuck_out_tongue_closed_eyes: | 😀 :grinning: |
| 😗 :kissing: | 😙 :kissing_smiling_eyes: | 😛 :stuck_out_tongue: |
