Discussion:
KST Speed
Erik Li
2018-05-05 05:12:29 UTC
Hi all,

What makes KST so fast? I've tried to look at the source but I'm having a
hard time getting a high-level overview of what is going on. I see
interpolation being defined in libkst/vector.cpp and can see some places
where it's called in libkstmath/curve.cpp, but the way that NS (num
samples) is being set would suggest to me that the interpolation is in fact
returning the raw data. My naive thought of how this should work is that
you essentially subsample the data to plot a smaller portion of the data,
but the accuracy with which kst plots things at different scales somewhat
belies this.

I've been super impressed by the application and its responsiveness. For
reference, I am trying to analyze data from electrophysiological
experiments, which for my workflow means time series datasets of around 1.5
to 15 million datapoints. Kst handles this like a champ with quick
panning/zooming. Even more impressive for me is that it maintains good
temporal resolution for the action potentials and doesn't cut them short
even when zoomed out. (The signals that I look at basically run from -60 to
-50 mV most of the time and then periodically "spike" to 20 mV for 1 ms at
a time. Those spikes are action potentials and the exact timing is pretty
important to my analyses)

We have an analysis workflow that uses R or python, but plotting the data
sometimes takes tens of seconds. I am trying to make a
small application that will allow reasonable display of the data and be
responsive to changes in analysis parameters so that you can actually
"play" with the data in an exploratory fashion.

I see two paths forward for me:
- Implement my analysis routines in kst using the plugin framework
  - in this case I would want to capture mouse events on the plot also.
    Are there bindings for mouse/keyboard events on the plot?
- Implement parts of the KST plotting algorithm for curves in R or python

I'm happy to try to move forwards in whatever way is going to make the most
sense, but in either case I would be really interested in getting a sense
of the algorithm that Kst is using to plot curves because it seems like
there is some quite sophisticated stuff going on here. I would have thought
fast plotting would be a broadly solved problem, but really in the open
source space Kst is the only solution I've found that works at this scale.
(Maybe also gnuplot, but I haven't explored it very carefully yet).

Thank you in advance for any insight or pointers on places in the source
that I should be looking at! And if this is explained somewhere on the
website or in a blog post, sorry for my cluelessness.

-Erik
Barth Netterfield
2018-05-10 02:47:47 UTC
Hi, Erik,

Sorry for the slow reply - KST may be fast, but apparently I'm not always.

If the X axis vector is monotonically increasing, then the curve drawing
is optimized by:
  Noting that the X resolution is finite (especially compared to a very
large vector), so many samples land in each column of pixels.
  Noting that one only needs to draw a line from the minimum value in a
column of pixels to the maximum value in that same column of pixels in
order to not miss anything.

and then doing the obvious thing: loop through the points, keep the min
and max for each column of pixels, and draw one line per column.  Since
looping through points and doing compares is far faster than drawing
lots and lots of lines, the drawing gets way way faster.
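
For anyone who wants to try the idea outside of kst, here is a minimal
sketch of per-pixel-column min/max reduction in Python with numpy. It is
not taken from the kst source; the function name minmax_per_column, its
parameters, and the synthetic trace are all made up for illustration, and
it assumes x is monotonically increasing as described above.

```python
import numpy as np

def minmax_per_column(x, y, x_min, x_max, n_columns):
    """Reduce (x, y) to one (low, high) pair per non-empty pixel column.

    Hypothetical helper, not kst code. Assumes x is monotonically
    increasing and x_max > x_min.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # x is sorted, so the visible window can be found by binary search.
    lo = np.searchsorted(x, x_min, side="left")
    hi = np.searchsorted(x, x_max, side="right")
    xv, yv = x[lo:hi], y[lo:hi]
    if xv.size == 0:
        return np.array([], dtype=int), np.array([]), np.array([])
    # Map every visible sample to the pixel column it falls in.
    cols = ((xv - x_min) / (x_max - x_min) * n_columns).astype(int)
    cols = np.minimum(cols, n_columns - 1)  # sample at x_max goes in last column
    # cols is non-decreasing, so each column is one contiguous run of samples.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(cols)) + 1))
    y_low = np.minimum.reduceat(yv, starts)
    y_high = np.maximum.reduceat(yv, starts)
    return cols[starts], y_low, y_high

# Synthetic trace: 1.5 million samples at -55 mV with one single-sample
# "spike" to 20 mV (values chosen only to resemble the data described above).
n = 1_500_000
x = np.arange(n) / 10_000.0
y = np.full(n, -55.0)
y[123_457] = 20.0

columns, y_low, y_high = minmax_per_column(x, y, x[0], x[-1], 1000)
strided = y[::1500]  # naive subsampling down to 1000 points

print(len(columns), y_high.max(), strided.max())
```

Each column would then be drawn as a single vertical line from y_low to
y_high, so the number of lines drawn depends on the plot width rather than
on the number of samples. In this example the spike survives the min/max
reduction, while the strided subsample happens to step over it.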

As to your problem:  You could consider using pykst.  You would do all
of your math and manipulation in python, as you do now, then plot it in kst.

You can plot numpy arrays in kst.  It takes about 0.14s to copy a
15,000,000 point numpy array into a running kst session and plot it. 
You should be able to overwrite and re-draw the vector in kst at will,
but still have a fully live kst session to zoom and scroll, etc.  pykst
has full control of most of the kst session, so this should actually
work pretty well I think.

Take a look at https://kst-plot.kde.org/pykst/

cbn
--
C. Barth Netterfield
416-845-0946
Erik Li
2018-05-10 17:46:04 UTC
Dear Barth,

Thanks for the explanation - it really is a clever method! I appreciate you
taking the time to break it down and for all the time/effort that has been
put into this piece of software.

Regarding pykst, I had taken a look at it and it seems like a great option
for plotting but (unless I missed it) doesn't support reporting of
mouse/keyboard events, so it would be hard for me to implement some of the
interactive analysis features I was hoping to include. It will likely be a
great option for my personal analysis, but for some of my lab members, the
less direct interaction with the script, the better. It may still be a good
option depending on where I decide to make the tradeoff between ease of
implementation and ease of use.

Thanks again,
Erik
