Granite Compiled Inference via llama.cpp

Introduction
Granite Compiled Inference via llama.cpp is a small C++ program, compiled natively against the llama.cpp library, that runs efficient local inference on GGUF models such as IBM's Granite family.
Project Overview
This component was developed as part of the Quantum Proximity Gateway (QPG) project at UCL. QPG integrates facial recognition, proximity detection, post-quantum encryption, and AI-powered device personalisation to deliver a secure, accessible, and intelligent user experience.
Prerequisites
- CMake
- C++ compiler (e.g., g++ or clang)
- GGUF model file (e.g., Granite 3.2 8B Instruct)
Installation and Setup
- Clone this repository.
- Clone llama.cpp into the same directory:
git clone https://github.com/ggml-org/llama.cpp
- Build llama.cpp:
cd llama.cpp
cmake -B build
cmake --build build --config Release
cd ..
- Compile the inference program (the command below assumes macOS, where the build produces libllama.dylib; on Linux, link against ./llama.cpp/build/bin/libllama.so instead; a sketch of main.cpp follows these steps):
clang++ -std=c++11 \
  -I./llama.cpp/include -I./llama.cpp/ggml/include \
  main.cpp ./llama.cpp/build/bin/libllama.dylib \
  -o gguf_infer -pthread \
  -Wl,-rpath,./llama.cpp/build/bin
- Run the program:
./gguf_infer <model-path.gguf>
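The repository compiles its own main.cpp, which is not reproduced here. For orientation, below is a minimal sketch of what such a program can look like, modelled on the simple example bundled with recent llama.cpp checkouts. The llama.cpp C API changes between releases (llama_model_load_from_file, llama_init_from_model, and the sampler-chain calls below are the post-2024 names), and the demo prompt and 64-token generation cap are arbitrary assumptions, so check every call against the llama.h in your clone.

```cpp
// Sketch of a gguf_infer-style program: load a GGUF model, tokenize a
// prompt, and greedily generate a fixed number of tokens. Modelled on
// llama.cpp's bundled examples/simple; verify names against your llama.h.
#include "llama.h"
#include <cstdio>
#include <string>
#include <vector>

int main(int argc, char ** argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <model-path.gguf>\n", argv[0]);
        return 1;
    }
    llama_backend_init();

    // Load the GGUF model from disk with default parameters.
    llama_model * model = llama_model_load_from_file(argv[1], llama_model_default_params());
    if (!model) { std::fprintf(stderr, "failed to load model\n"); return 1; }
    const llama_vocab * vocab = llama_model_get_vocab(model);

    // Tokenize the prompt; a first call with a NULL buffer returns the
    // required token count as a negative number.
    const std::string prompt = "Hello, world!";  // assumption: fixed demo prompt
    const int n_prompt = -llama_tokenize(vocab, prompt.c_str(), prompt.size(), NULL, 0, true, true);
    std::vector<llama_token> tokens(n_prompt);
    llama_tokenize(vocab, prompt.c_str(), prompt.size(), tokens.data(), tokens.size(), true, true);

    // Create a context large enough for the prompt plus generated tokens.
    const int n_gen = 64;  // assumption: arbitrary generation cap
    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx   = n_prompt + n_gen;
    cparams.n_batch = n_prompt;
    llama_context * ctx = llama_init_from_model(model, cparams);

    // Greedy sampler: always pick the most likely next token.
    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // Feed the prompt, then generate one token at a time.
    llama_batch batch = llama_batch_get_one(tokens.data(), (int32_t) tokens.size());
    for (int i = 0; i < n_gen; i++) {
        if (llama_decode(ctx, batch) != 0) break;
        llama_token tok = llama_sampler_sample(smpl, ctx, -1);
        if (llama_vocab_is_eog(vocab, tok)) break;  // stop at end-of-generation

        char buf[256];
        const int n = llama_token_to_piece(vocab, tok, buf, sizeof(buf), 0, true);
        if (n > 0) std::fwrite(buf, 1, n, stdout);
        std::fflush(stdout);

        batch = llama_batch_get_one(&tok, 1);  // next step decodes only the new token
    }
    std::printf("\n");

    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

Note that for an instruct-tuned model such as Granite 3.2 8B Instruct, you would normally format the prompt with the model's chat template (llama.h exposes llama_chat_apply_template for this) rather than feeding raw text as the sketch does.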
Why Quantum Proximity Gateway?
QPG eliminates the need for manual logins, offering seamless, intelligent, and secure device access using facial recognition and proximity detection, underpinned by post-quantum encryption.
About this Creation
Compiled inference for Granite GGUF models using llama.cpp, built as part of the Quantum Proximity Gateway project. Created by Marwan Yassini Chairi El Kamel, Raghav Awasthi, Abdulhamid Abayomi, and Abdul Muhaymin Abdul Hafiz.