Local AI

Has anyone used local AI? I configured vLLM and registered it in Wappler. Wappler detects the model, but when I make a request, it returns:

400 "auto" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set

In the vllm terminal i see: “POST /v1/chat/completions HTTP/1.1” 400 Bad Request

I installed and configured one of the NIVIDA Spark boxes. Loaded several different LLMs on it and was able to successfully communicate with it through wappler API calls.

Might want to post what your api call looks like? You know is your authentication configured correctly? Any firewall blocking issues. It could be all kinds of issues. We just need a little more information on what you are doing exactly.

I ran the following on the server:
python3 -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2.5-72B-Instruct-AWQ --dtype float16 --quantization awq_marlin --gpu-memory-utilization 0.90 --max-model-len 32768 --host 10.80.3.3 --port 8000

In Wappler, I set up a Custom AI Provider and specified the base URL http://10.80.3.3:8000/v1/

In the AI chat in Wappler, I typed “Hi.” That's it. I'm getting an error.

Everything works fine in the WebUI.

btw, If the path to the AI server is specified incorrectly, Wappler freezes at startup (and sends a report about this to the server after crash). I had to manually edit the options.json file to get Wappler to start.

When the LLM use an Open AI compatible API it should work fine.

I believe the server you've setup is not configured correctly, you should set those 2 required options from the error message. I have no experience with vLLM, you should check their documentation on how to configure.

This should not happen, will investigate it.

1 Like

I fixed the VLLM, so now I can type “Hi” into the Wappler AI assistant, but after that, Wappler tries to send more than 32,768 tokens to the model, and the model crashes. I increased the model's input token limit to 65,535, and it started working. However, the model returned an error; though it did warn right away that it only works with 32k tokens. Now it’s clear why Wappler burned through $20 in ChatGPT in 15 minutes :slight_smile: :slight_smile: :slight_smile:

I have an RTX PRO 6000 with 96 GB of memory. Is this card powerful enough to run the LLM for Wappler? Or do I need 8 B300s? :wink:

I’ll keep experimenting and let you know when something works.

The initial message includes the system prompt which contains all the base knowledge and guidelines for the open project that the LLM should know. If the LLM backend does the caching correctly the follow up calls should cost a lot less and with a local model it doesn't cost anything except GPU power and memory. The needed GPU depends on the LLM model that is being used.

1 Like

I'm not able to reproduce the issue, when having an invalid url/path to the AI server it doesn't freeze or crash.

I looked up the error reports that are send to sentry and noticed that you had several crashes with Out of Memory error. I didn't see in the data that much memory was used, I believe it was around 500Mb and enough free memory. Also didn't notice any loops or so in the stack trace. Can't say what exactly caused it, I've counted 4 of these errors and they seem to happen when the V8 engine is doing Garbage Collection.

Will do some more testing but not sure if the Out Of Memory crashes and incorrect AI url are related.

i have 32Gb RAM (i think about half is free). When url was just ip:worng port Wappler crashing after i use ip:port/v1/ everything is ok.

I tried again. I didn't run LLM at all. In Wappler, I entered the wrong port in the URL— I put 8080 instead of 8000. I closed Wappler. I opened it again. Wappler froze and started using 16% of the CPU; memory usage jumped from 2 GB to about 7 GB in roughly 2–3 minutes, and Wappler threw an error and sent you a report. Not a maximum memory was used. Still 30% free.

I set the port back to the correct one, and Wappler started running. I'm not sure if vLLM is working right now because it's currently downloading a new model, and I don't think it should be responding to API requests, but I'm not sure.

I tried to talk to the AI agent, but it returned a connection error. The LLM is definitely not running. It turns out that in this case, Wappler freezes if the port in the URL is incorrect in the settings.

Checked error reports and have a new error now, it seems that in your case it indeed comes in a loop with the AI manager. I will try to fix it for the next update.

1 Like

I'm happy to help :slight_smile:

Qwen3-32B worked with Wappler, but when I gave it a sufficiently large task, the LLM crashed. Next, I used Qwen3-coder-30B-A3B-Instruct and gave it the same task (an 18-KB text description of a 10-page website). After that, Wappler wrote:

"I'll create a modern website structure for the IT business company based on your requirements. Let me start by examining the current project structure and then implement the requested pages.

First, let me check what files already exist in the project:"

and then went silent. There were no further requests to LMM.

No errors, but nothing is happening.

However, such a large request using ChatGPT wasn’t the best idea; I’ll continue exploring how this can be used....

Is there anywhere I can view the log of how Wappler communicates with the LLM? So I can understand why it stopped working....

UPD:
I tried talking to the AI again:

"- Are you sleeping?

  • I'm awake and ready to help you create the website structure. Let me first check what files already exist in your project to understand the current structure."

Wappler AI loaded the LLM for literally 10 seconds and went back to sleep.... :slight_smile:

UPD2:
At least for now, I can chat with Wappler AI; it apologizes and says, “I'll get to work now,” but nothing happens.... :slight_smile:

Does it do any tool calls? Perhaps support for tool calls is not correctly configured.

Requirements for Wappler is supporting a context of minimal 65k and tool calling.

Idk, no errors, everything is ok but nothing happend. I am new in LLM, will think how to check or fix this on llm side. But if some tool are not working properly better to have some message from Wappler too….

I replaced the tool parser in the model and observed the following changes. When I run the “create index page” query, Wappler outputs:

“<function=get_folder> <parameter=folder_path> file:///D:/PRJ/Wappler/AITest/views </tool_call>”

And then nothing else happens. The LLM remains idle.

The issues are clearly with the tool system.

Which LLMs have you used?

configured them through webUI.

llama3.3:70b

llama3.1:latest

gpt-oss:20b

1 Like

i try casperhansen/llama-3.3-70B-instruct-awq.

for prompt “create index page“ Wappler write:
”Failed to create index.ejs,
Failed to edit index.ejs
Failed to read index.ejs”
This is happend if index.ejs opened.

But when i close it wappler AI start working!

Welcome to my app!

it is much better than before :wink: