Hello, Ehsan. You can self-host the DeepSeek models using Ollama.
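Here is a minimal sketch of what that looks like from Python, assuming you have the `ollama` package installed (`pip install ollama`), the Ollama server running, and a DeepSeek tag already pulled (e.g. `ollama pull deepseek-r1:7b`); the exact tag is an assumption, so check the Ollama model library for what fits your hardware:

```python
import ollama

# Chat with a locally hosted DeepSeek model via the Ollama server.
response = ollama.chat(
    model="deepseek-r1:7b",  # assumed tag; pick a size that fits your VRAM
    messages=[{"role": "user", "content": "Summarize this abstract: ..."}],
)
print(response["message"]["content"])
```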
1. Challenges: If your GPU does not have enough VRAM, hosting the larger models is simply not feasible. In that case, smaller models are the only option, which can be limiting if your tasks are complex. The alternative is calling the model's hosted API instead of self-hosting, but that raises privacy and budget concerns.
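For reference, the hosted route looks like this. This is a hedged sketch using the `openai` client, which DeepSeek documents its API as compatible with; the base URL and model name here are assumptions, so verify them against the current DeepSeek docs, and remember that prompts leave your machine (privacy) and tokens are billed (budget):

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder, not a real key
    base_url="https://api.deepseek.com",  # assumed endpoint
)
response = client.chat.completions.create(
    model="deepseek-chat",                # assumed model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```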
2. Qualitatively, not that much. However, using them can significantly reduce time if the research workflow involves repetitive processes, as in the sketch below.
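A toy example of the kind of repetitive step that benefits: running the same prompt over many inputs in a loop. The file paths and model tag are hypothetical:

```python
import ollama

# Apply one fixed prompt to a batch of files instead of doing it by hand.
for path in ["notes/exp1.txt", "notes/exp2.txt"]:  # hypothetical inputs
    text = open(path).read()
    reply = ollama.chat(
        model="deepseek-r1:7b",
        messages=[{"role": "user", "content": f"Extract the key findings:\n{text}"}],
    )
    print(path, "->", reply["message"]["content"])
```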
3. Follow the official Ollama documentation for installation. Start with the defaults, and adjust the related parameters/configurations (context length, sampling settings, model size) once the basic setup works.
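For example, generation parameters can be adjusted per request through the `options` dict of the `ollama` Python client; the specific values here are illustrative, not recommendations:

```python
import ollama

# Same call as before, but with tuned generation options.
response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Explain tokenization briefly."}],
    options={"temperature": 0.3, "num_ctx": 8192},  # sampling + context length
)
print(response["message"]["content"])
```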