Indoor localization can be achieved with a passive, an active, or a hybrid solution that integrates both. What matters is the location of the receivers and, in the case of an active technology, the Received Signal Strength Indication (RSSI) values of the beacons. We have developed a hybrid system that is in use in hospitals and in production environments. If you are interested, I can send you further material.
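To illustrate why RSSI values are relevant for active systems, here is a minimal sketch of the common log-distance path loss model, which turns one beacon reading into a rough range estimate. It is not the hybrid system mentioned above. The function name, the reference power of -59 dBm at 1 m, and the path loss exponent of 2.0 are assumptions for illustration; in practice both parameters would have to be calibrated per site.

```python
def rssi_to_distance(rssi_dbm, ref_power_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate the distance to a beacon in metres from one RSSI reading.

    Log-distance path loss model:
        rssi = ref_power - 10 * n * log10(d)
    ref_power_dbm is the (assumed) RSSI measured at 1 m from the beacon,
    path_loss_exponent is n (about 2 in free space, often higher indoors).
    """
    return 10 ** ((ref_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


if __name__ == "__main__":
    # Hypothetical readings; weaker signal means larger estimated distance.
    for rssi in (-59, -69, -79):
        print(rssi, "dBm ->", round(rssi_to_distance(rssi), 2), "m")
```

Indoors, single readings fluctuate strongly because of multipath and people moving around, so such estimates are usually smoothed over time and combined across several receivers.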
Various modalities can help in indoor localization: RF fingerprints (Wi-Fi, GSM, etc.), motion data from body-worn IMUs, signals from pre-deployed beacons (UWB, RFID, …), ambient fingerprints (sound, light, etc.), wall maps of the area of interest, video recordings, etc. In the end it depends on your specific requirements, in particular the required accuracy, the installation and maintenance effort you can afford, and what exactly you want to track (workers? objects?). Typically, systems that build on pre-deployed infrastructure are very accurate (in the cm range), but expensive and tricky to install and maintain. Conversely, fingerprinting methods are not very precise (~10 m), but they are fast to deploy, and tools are available online.
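As a rough idea of how RF fingerprinting works, here is a minimal sketch using k-nearest neighbours in signal space: the query reading is compared against a previously surveyed set of readings, and the positions of the closest matches are averaged. The survey data, the `locate` function, and the choice of k are made up for illustration and are not taken from any particular tool.

```python
import math

# Hypothetical site survey: (x, y) position in metres and the RSSI in dBm
# seen there from three access points. All values are invented.
SURVEY = [
    ((0.0, 0.0), (-40, -70, -80)),
    ((10.0, 0.0), (-70, -40, -80)),
    ((0.0, 10.0), (-70, -80, -40)),
    ((10.0, 10.0), (-75, -60, -60)),
]


def locate(query, fingerprints, k=3):
    """Return the mean position of the k fingerprints closest to `query`.

    Closeness is the Euclidean distance between RSSI vectors.
    """
    ranked = sorted(fingerprints, key=lambda fp: math.dist(query, fp[1]))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return (x, y)


if __name__ == "__main__":
    print(locate((-55, -55, -80), SURVEY, k=2))
```

The accuracy of such a scheme is bounded by how densely the area was surveyed and how stable the signals are, which is one reason fingerprinting ends up in the metre range rather than the cm range.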
When you say tracking algorithm, do you mean for human motion? Various sensing modalities can be used to detect human-object interactions, as Michael Hardegger outlined above. If you are more interested in the actual human motions, computer vision may be a useful approach. Colour cameras have a good field of view but may lack accuracy. Depth cameras (and their existing SDKs) can give you great body-part tracking but generally cover a much more restricted space. In all of these cases, though, I think computer vision does not scale easily, so it is a good approach if you are seeking high accuracy in one or two applications. If scalability is a big concern, interaction-based sensing may be better.