First you find the pid of the process you want to profile. Then you run
fprof:trace([start, {file, "f"}, {procs, [pid(0,1292,0)]}]).
and let it cook for a while. The file "f" will contain the raw trace data. When it has cooked long enough, you can run
fprof:trace(stop).
to close the file and stop the trace. Then you read in the data with
fprof:profile(file, "f").
which builds an in-memory database from the trace data. You then write the analysis, formatted 150 columns wide, with
fprof:analyse([{dest, "f.analysis"}, {cols, 150}]).
to the file "f.analysis".
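If you run this sequence often, the steps above can be wrapped in one function. What follows is only a sketch: the module name fprof_sketch and the function profile_pid/2 are made up for illustration, and it assumes the same file names as above ("f" and "f.analysis"). It takes a real pid rather than pid(0,1292,0), because pid/3 is a shell helper and is not available in compiled code.

```erlang
-module(fprof_sketch).
-export([profile_pid/2]).

%% Hypothetical helper, not part of fprof: trace Pid for Millis
%% milliseconds, then write the analysis to "f.analysis".
profile_pid(Pid, Millis) when is_pid(Pid), is_integer(Millis), Millis >= 0 ->
    %% Start tracing the process; raw trace data goes to the file "f".
    ok = fprof:trace([start, {file, "f"}, {procs, [Pid]}]),
    %% Let it cook.
    timer:sleep(Millis),
    %% Stop the trace and close the file.
    ok = fprof:trace(stop),
    %% Build the in-memory profile database from the trace file.
    ok = fprof:profile(file, "f"),
    %% Write the analysis, 150 columns wide.
    ok = fprof:analyse([{dest, "f.analysis"}, {cols, 150}]).
```

From the shell you could then call, for example, fprof_sketch:profile_pid(whereis(my_server), 10000). to profile a registered process for ten seconds, where my_server is a placeholder name. The ok matches make the function crash early if any step reports an error, which is probably what you want in an interactive session.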
To understand the profile output, read the fprof chapter of the Tools User's Guide, which describes the output format. Let the CPU-cycle hunt begin!