This post is for people interested in doing optimisation or inference with Differential Equation (DE) models.

If you are a statistician, you might be used to treating model simulators as black boxes where you can stick parameters in and get outputs out. This post is about why you need to be a bit careful with that. It examines one of the quirks of working with differential equations and optimisation/inference that my team have bumped into in a few distinct situations – including simulators given out for public optimisation competitions! I haven’t seen it referred to in any of the textbooks, but please let me know in the comments if you have.

Below in Figure 1 is a likelihood surface (or objective function) that we came across (more on the definition of it below), as a function of one of the parameters in a cardiac action potential model. We are trying to find the maximum in this case.

Not all optimisers rely on a nice smooth gradient – but they do all enjoy them! This is a horrible surface and no matter what kind of optimiser you use it is going to struggle to move around and explore something that looks like this. The red line marks the data-generating value in this case, and the green is somewhere we got stuck. Remember this is only in one dimension, now imagine it in ten or more…

To make matters worse, we might want to run MCMC on this surface to get a posterior distribution for the parameter on the x-axis. We see that there are ‘spikes’ of about 40 log-likelihood units. What does that mean? Well if we are talking about the probability of accepting a trough from a spike in Figure 1 using an MCMC Metropolis-Hastings step, that equates to an acceptance ratio of exp(-40) = 4×10^-18 ! Our chains will certainly get stuck and never move across this space nicely.
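As a quick check of that arithmetic, here is a minimal Python sketch of the Metropolis acceptance probability, assuming a symmetric proposal and a flat prior so that only the log-likelihood difference matters. The function name `acceptance_probability` is purely illustrative.

```python
import math

def acceptance_probability(log_like_current, log_like_proposed):
    """Metropolis acceptance probability, assuming a symmetric proposal and flat prior."""
    log_ratio = log_like_proposed - log_like_current
    # Uphill moves are always accepted; downhill moves with probability exp(log_ratio).
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

# Stepping from the top of a spike into a trough 40 log-likelihood units lower:
p_down = acceptance_probability(0.0, -40.0)
# Stepping the other way, from trough to spike:
p_up = acceptance_probability(-40.0, 0.0)
print(f"{p_down:.1e}", p_up)
```

Getting onto a spike is easy, but the chain would need on the order of 10^17 proposals before it could be expected to step off one again.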

Is the problem really so non-linear that it has got thousands of local minima, or modes in a posterior, as this suggests? Thankfully, the answer is ‘No’!

After a bit of detective work we figured out that this bumpy surface is entirely due to numerical error in our simulation, and it should be completely smooth! The example is from an Ordinary Differential Equation (ODE) solver but Partial Differential Equation (PDE) solvers will also give the same behaviour.

Most of the time we can’t derive exact analytic solutions to our models’ equations, so we have to use numerical solution techniques; the simplest of these is the Forward Euler method. These numerical methods give you only an approximation to the solution of your equations, which you try to ensure is accurate by taking more computational effort by adding steps in your approximation (finer time steps) and checking the solution is converging to an answer. As you keep refining, the solution should change less and less.
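A minimal sketch of that refinement check, using Forward Euler on a toy decay equation chosen because its exact solution is known. The function name and the test problem are illustrative assumptions, not taken from our cardiac models.

```python
import math

def forward_euler(f, y0, t_end, n_steps):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end using n_steps equal steps."""
    dt = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y += dt * f(t, y)  # step along the current slope
        t += dt
    return y

def decay(t, y):
    """Toy right-hand side: dy/dt = -y, so y(1) = exp(-1) when y(0) = 1."""
    return -y

exact = math.exp(-1.0)
step_counts = (10, 100, 1000)
errors = [abs(forward_euler(decay, 1.0, 1.0, n) - exact) for n in step_counts]
for n, err in zip(step_counts, errors):
    print(n, err)
```

Each ten-fold refinement of the time step cuts the error by roughly a factor of ten here, which is the first-order convergence you would expect from Forward Euler.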

Broadly speaking we can classify the different ODE solvers into: **fixed step**, like the Forward Euler method, that take the same size time steps as they go along; and **adaptive step**, that vary the size of their time steps as they go along.

With an adaptive time-step solver you give a target tolerance (relative to the size of the variables (RelTol), or absolute (AbsTol), or typically both) and it refines the steps to try to maintain these tolerances on each step. In the example here we used CVODE but another common one is the Matlab ode15s stiff ODE solver. The same principle would also apply if you use a fixed-step solver: it would need smaller time steps rather than tighter tolerances.

In Figure 2 we show the shift in the likelihood surface as we tighten the ODE solver tolerances (Relative, Absolute in brackets above each plot):

In general RelTol = 10^-4 and AbsTol = 10^-6 are not unreasonable choices for a single ODE solve; indeed Matlab’s *defaults* are RelTol = 10^-3 (less precise than Figure 1) and AbsTol = 10^-6 (the same).

So why is this effect so big?

**Likelihoods**

A very common assumption is that a ‘data generating process’ (the way that you end up with observations that some instrument records) is:

data = reality + observation noise on each data point

Another common assumption is that the noise here is Gaussian, **independent** on each data point and **identically-distributed** (comes from a Normal distribution with the same mean (often zero) and standard deviation). This is known as “**i.i.d.**” Gaussian noise.

A third assumption is that ‘reality’ in our equation above is given by the smooth noise-less model output. This is obviously a bit shaky (because no model is perfect), but the idea is you can still get useful information on the parameters within your model if it is close enough (N.B. bear in mind you might get overconfident in the wrong answer – this is a good paper explaining why). So we then commonly have:

data = model output + i.i.d. Gaussian noise.

We can then write down a log-likelihood (log just because it is easier to work with numerically…) and we end up with a big sum-of-square errors across all of our time trace:

(see the Wikipedia derivation from the Normal probability density function). Here we take the mean to be the model output given some parameter set; x to be the observed data points and sigma is the i.i.d. noise parameter.
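Written out with those symbols, the standard log-likelihood for i.i.d. Gaussian noise takes the form below. The index i over the n time points and the symbol θ for the parameter set are notation assumed here for illustration.

$$
\log L(\theta, \sigma \mid x) = -\frac{n}{2}\log(2\pi) - n\log(\sigma) - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}\bigl(x_i - \mu_i(\theta)\bigr)^{2}
$$

The last term is the sum-of-square errors. Any numerical error in the model output μ_i(θ) enters at every one of the n time points and is scaled by 1/(2σ²).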

The reason that we have come across this problem perhaps more than other people isn’t that we have been more sloppy with our ODE solving (we put some effort into doing that relatively well!), but that we are dealing with problems that consist of high frequency samples of time-series data. We commonly work with a few seconds of 10kHz time sampled recordings, so we can end up with around 100,000 data points.

Why is this important? Say your simulation and data diverge by >=1.1 standard deviations of the noise level (P<0.86 in a statistics table) instead of >= 1 standard deviation (P<0.84) because of numerical error. If this happens at 100 time points then your probabilities multiply and become 0.86^100 = 3×10^-7 and 0.84^100 = 3×10^-8. It has become about ten times less likely that your parameters gave rise to the data, because of numerical error that had a relatively small effect on the solution at each time point. As we have more and more data points, this effect is exaggerated until even tiny shifts in the solution have huge effects on probabilities, as we saw above.
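The products above are quick to check. The sketch below also extends them to 100,000 points, under the deliberately crude assumption that the same per-point probabilities apply at every point. It works in log space because the raw products underflow long before that. The function name is illustrative.

```python
import math

def log10_probability_ratio(p_a, p_b, n_points):
    """log10 of (p_a / p_b) ** n_points, computed in log space to avoid underflow."""
    return n_points * (math.log10(p_a) - math.log10(p_b))

# The two products for 100 time points.
product_a = 0.86 ** 100
product_b = 0.84 ** 100
print(f"{product_a:.1e}", f"{product_b:.1e}")

# Ratio between them, and the same ratio (as orders of magnitude) for 100,000 points.
ratio_100 = 10 ** log10_probability_ratio(0.86, 0.84, 100)
orders_100k = log10_probability_ratio(0.86, 0.84, 100_000)
print(ratio_100, orders_100k)
```

With 100 points the ratio is about ten. With 100,000 points the same small per-point shift is worth around a thousand orders of magnitude.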

There’s a slight subtlety here: you might have already checked that your solution is converging to within a pre-specified tolerance *for a given parameter set*. For example, a modeller might say “I don’t care about changes of less than 0.01% in these variables, so I set the solver tolerance accordingly”, and a statistician treating the simulator as a black box might just run with that. But what is important here is not the error bound on the individual variables at a given parameter set, but the error bound that the likelihood transformation of these variables demands in terms of reducing jumps in likelihood *as a function of parameters*. So the modeller and statistician need to talk to each other here to work out whether there might be problems…

**Conclusions**

I wouldn’t be surprised to find that this is one of the reasons people have found the need to use things like genetic algorithms in cardiac problems. But I suspect the information content, un-identifiability and parameter scalings are also very important factors in that.

So what should you do?

Examine *1D likelihood slices*. We can fix all but one of the parameters and vary the remaining one, plotting out the likelihood as above, then repeat for each parameter in turn. Then tighten your solver tolerances until 1D slices of your likelihood are smooth enough for optimisers/MCMC to navigate easily. Whatever this extra accuracy costs in additional solver time will be repaid by far more efficient optimisation/inference (in the examples we have looked at, the worst cost is approximately just 10% more solve time for a solve with 10x tighter tolerances, resulting in thousands of times speed up in optimisation).
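A minimal sketch of a 1D slice, assuming i.i.d. Gaussian noise. The `simulate` function is a hypothetical stand-in (an analytic exponential decay) for a real ODE solve, and `roughness` is just one crude way to flag spikes between neighbouring points. None of the names come from our actual code.

```python
import math

def gaussian_log_likelihood(data, model_output, sigma):
    """Log-likelihood of the data under i.i.d. Gaussian noise around the model output."""
    n = len(data)
    sse = sum((x - m) ** 2 for x, m in zip(data, model_output))
    return -0.5 * n * math.log(2 * math.pi) - n * math.log(sigma) - sse / (2 * sigma ** 2)

def likelihood_slice(simulate, data, sigma, params, index, values):
    """Log-likelihood along parameter `index`, holding all other parameters fixed."""
    slice_ll = []
    for value in values:
        trial = list(params)
        trial[index] = value
        slice_ll.append(gaussian_log_likelihood(data, simulate(trial), sigma))
    return slice_ll

def roughness(slice_ll):
    """Largest absolute second difference; large values flag spikes in the slice."""
    return max(abs(a - 2 * b + c) for a, b, c in zip(slice_ll, slice_ll[1:], slice_ll[2:]))

# Hypothetical stand-in simulator: exponential decay with rate params[0].
times = [0.01 * i for i in range(200)]

def simulate(params):
    return [math.exp(-params[0] * t) for t in times]

data = simulate([1.0])  # noise-free synthetic data, for illustration only
values = [0.9 + 0.002 * i for i in range(101)]
ll = likelihood_slice(simulate, data, 0.05, [1.0], 0, values)
print(values[ll.index(max(ll))], roughness(ll))
```

With a real simulator in place of the stand-in, you could compute the same slice at a loose and a tight solver tolerance and compare the two roughness values, or simply plot both slices.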

What about *thinning the data*? A way to get rid of this problem would be to remove a lot of data points, something that’s called ‘thinning’ in the MCMC literature (although it usually refers to the MCMC chain afterwards rather than the data). I’m not a fan of doing it to the data. It will artificially throw away information and make your posteriors wider than they should be according to your noise model. You might not completely trust your likelihood/noise model, but thinning doesn’t automatically fix it either!

Finally, this post wouldn’t be complete without mentioning that there is a relatively new way to consider this effect, which explicitly admits that we have error from the solver, and treats it as a random variable (which can be correlated through time):

data = model + numerical approximation error + observation noise.

Dealing with this formulation is the field known as *probabilistic numerics* (see the homepage for this), and you can use it to make MCMC take account of numerical errors. In our case, I expect this approach could help by effectively warming up (cf. tempering methods) the likelihood and making the spikes relatively smaller and more jump-able. Interestingly, in the above plots you can see that this isn’t independent noise as you move through parameter space; I don’t know enough about the subject to say whether that has been handled or not! I’m not convinced it is worth the extra complication. Maybe for big PDE models it will be worth the trouble, but for the reasonably lightweight ODEs involved in single cell cardiac work it is probably just worth solving more accurately all the time.

On the 9th November 2017 the CiPA in-silico Working Group hosted a meeting in Toronto General Hospital that the Cardiac Physiome meeting kindly let us run as a satellite meeting – a big thanks to them for organising the logistics of room booking etc.

The in-silico aspects of CiPA are led by the FDA Center for Drug Evaluation and Research. You might find the background document that we put together useful if you haven’t heard of CiPA. I’ve also written a post on the idea before. The FDA team let me organise this long half day with the following aims:

- To inform the cardiac modelling community about the CiPA initiative.
- To get feedback on the FDA’s work to date.
- To draw attention to other research in the area they might not have been familiar with.
- To discuss the next steps.
- To spark more research and collaborations in this area.

It was a fascinating and thought provoking day, plenty of work for us to do, as you’ll see on my summing up slides at the end of the day. Here are links to all the talks, that you can also find in a Figshare Collection.

- David Strauss (FDA): Introduction and Regulatory Perspective
- Then we had some talks on hERG modelling, the appropriate Markov model structure to use for the baseline/control, and also how to measure drug binding kinetics and their importance:
- Kylie Beattie (FDA, GlaxoSmithKline): Selecting a model of hERG channel kinetics and drug binding
- Wendy Wu (FDA): An experimental perspective on measuring ion channel block
- Randall Rasmusson (Buffalo): Modeling HERG and IKr for pro-arrhythmic drug interactions
- Adam Hill (Victor Chang): Measuring (and modelling) kinetics of drug binding to hERG – does it matter and can we do it?

- We then considered the process of testing and optimising the baseline action potential model (O’Hara-Rudy 2011) for studying drug action:
- Zhihua Li (FDA): Optimization of Cardiac Myocyte Model for CiPA Initiative
- Trine Krogh-Madsen (Cornell): Global optimization of ventricular myocyte model to multi-variable objective improves predictions of drug-induced Torsades de Pointes

- This was followed by a talk on modelling stem-cell derived myocytes
- Brian Carlson (University of Michigan): An Expression-based Theoretical Model of Human iPSC-derived Cardiomyocytes

- Then the role of L-type Calcium channel and the importance of characterising its baseline/control kinetics when considering conductance block effects at the action potential level
- The next section looked at how we might validate model predictions, which started with the FDA team outlining their choice of metric, validation plans, and how they had performed Uncertainty Quantification.
- We then considered whether far simpler models could give the same kind of predictions
- Jaimit Parikh (IBM): Do in-silico models provide improved risk prediction of drug-induced Torsades de Pointes?
- Then finished this section by considering whether real-world risk data could be used to continually update risk categories and move to a more continuous risk score
- Mark Davies (QT Informatics): Our validation criteria should be determined by our understanding of the road ahead
- There then followed a discussion panel made up of Elisa Passini, Ele Grandi, Sebastian Polak and myself.
- Finally we did a bit of summing up before a well-earned meal and drinks!
- Gary Mirams (Nottingham): Summing Up


In their 2009 simulation study comparing properties of Hodgkin Huxley vs. Markov Models (well worth a read) Martin and Denis discussed how an optimised short voltage step protocol might contain enough information to fit the parameters of models (termed an ‘identifiable’ model/protocol combination) in a relatively short amount of experimental time.

We picked up on these ideas when Kylie came to look at models of hERG. We originally wanted to study different modes of drug binding with hERG and design experiments to quantify that. Unfortunately, it rapidly became clear there was little consensus on how to model hERG itself, before even considering drug binding.

OK, so we have lots of different structures, but does this matter? Or do they all give similar predictions? Unfortunately – as we show in Figure 1 of the paper – quite a wide variety of different current profiles are predicted, even by models for the same species, cell type and temperature.

So Kylie’s PhD project became a challenge of deciding where we should start! What complexity do we need in a model of hERG (for studying its role in the action potential and what happens when it is blocked), and how should we build one?

These questions link back to a couple of my previous posts – how complex should a model be, and what experiments do we need to do to build it? Kylie’s thesis looked at the question of how we should parameterise ion channel models, and even how to select the right ion channel model to use in the first place. We had quite a lot of fun designing new voltage clamp protocols and then going to a lab to test them out. The full story is in Kylie’s thesis, and we present a simpler version that just shows how well you can do with one basic model in the paper.

Kylie did a brilliant job: as well as doing all the statistical inference and mathematical modelling work, she went and learnt how to do whole-cell patch clamp experiments herself, at Teun de Boer’s lab and also with Adam Hill and Jamie Vandenberg. Patch clamp is an amazing experimental technique where you effectively get yourself an electrode in the middle of a cell; my sketch of how it works is in Figure 2.

We decided that the traditional approach of specific fixed voltage steps (which neatly de-couples time- and voltage-dependence) was a bit slow and tricky to assemble into a coherent model. So we made up some new sinusoid-based protocols for the patch clamp amplifier to rapidly probe the voltage- and time-dynamics of the currents. Things we learnt along the way:

- Whilst it might work in theory for the model, you also might fry the cells in real life (our first attempts at protocols went up to +100mV for extended periods of time, which cells don’t really like).
- HEK and CHO cells have their own voltage-dependent ion channels (which we call ‘endogenous’ voltage-dependent currents) which you can activate and mix up with the current you are interested in.
- It’s really important to learn what all the dials on a patch clamp amplifier do(!), and adjust for things like liquid junction potential.
- Synthetic data studies (simulating data, adding realistic levels of noise, and then attempting to recover the parameters you used) are a very useful tool for designing a good experiment. You can add in various errors and biases and see how sensitive your answers are to these discrepancies.
- Despite conductance and kinetics being theoretically separable/identifiable, and practically in synthetic studies, we ended up with some problems here when using real data (e.g. kinetics make channel ‘twice as open’ with ‘half the conductance’. You can imagine this is impossible if the channels are already over 50% open, but maybe quite likely if only 5% of the channels are open?). We re-designed the voltage clamp to include an activation step to provoke a nice large current with a large open probability, based on hERG currents people had observed before.
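The synthetic data study mentioned in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a hypothetical single exponential decay rather than an ion channel model, the noise is i.i.d. Gaussian, and a crude grid search stands in for a real optimiser.

```python
import math
import random

def simulate(rate, times):
    """Hypothetical stand-in model: a single exponential decay."""
    return [math.exp(-rate * t) for t in times]

def sum_of_squares(data, model_output):
    """Sum of squared differences between data and model output."""
    return sum((d - m) ** 2 for d, m in zip(data, model_output))

random.seed(1)  # fixed seed so the illustration is repeatable
times = [0.01 * i for i in range(500)]
true_rate, noise_sd = 2.0, 0.05

# 1. Simulate data from known parameters and add i.i.d. Gaussian noise.
data = [y + random.gauss(0.0, noise_sd) for y in simulate(true_rate, times)]

# 2. Attempt to recover the parameter from the noisy data.
candidates = [1.0 + 0.01 * i for i in range(201)]
best_rate = min(candidates, key=lambda r: sum_of_squares(data, simulate(r, times)))

# 3. Compare the estimate with the value used to generate the data.
print(true_rate, best_rate)
```

Adding a deliberate bias in step 1 (an offset, say, or a mis-specified noise level) and re-running steps 2 and 3 gives a feel for how sensitive the recovered parameters are to that kind of discrepancy.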

But to cut a very long story short – it all worked better than we could have imagined. Figure 3 shows the voltage protocol we put into the amplifier, and the currents we recorded in CHO cells that were over-expressing hERG. We then fitted our simple Hodgkin-Huxley style model to the current, by varying all of its parameters to get the best possible fit, essentially.

So a great fit, but that doesn’t mean anything on its own – see my previous post on that. So we then tested the model in situations that we would like it to make good predictions, here under cardiac action potentials and also slightly awry ones, see Fig 4.

We repeated this in a few different cells, and this lets us look at cell-cell variability in the ion channel kinetics via examining changes in the model’s parameters. Anyway, that is hopefully enough to whet your appetite for reading the whole paper! As usual, all the data, code, and (perhaps unusually) fitting algorithms are available for anyone to play with too.

**Wish list: if you can help with any of these, let’s collaborate! Please get in touch.**

- A better understanding of identifiability of conductance versus kinetic parameters, and how to ensure both.
- A way to design voltage clamp protocols for particular currents (this was somewhat hand tuned).
- A way to select between different model structures at the same time as parameterising them.
- A way to say how ‘similar’ (in terms of model dynamics?) a validation protocol is to a training protocol. If validation was too similar to training, it wouldn’t really be validation… we think our case above is ‘quite different’, but could we quantify this?
- A way to quantify/learn ‘model discrepancy’ and to put realistic probabilistic bounds on our model predictions when we are using the models “out in the wild” in future unknown situations.

\* *hERG* is the gene that encodes for the mRNA that is translated into a protein that assembles into homotetramers (groups of four of the same thing stuck together) in the cell membrane. This protein complex forms the main part of a channel in the cell membrane (Kv11.1) that carries the ionic current known as the “rapid delayed rectifier potassium current” or I_{Kr}. So you can see why we abuse the term hERG and say things like “hERG current”!

I am a massive fan of open access publication and open science in general. It is quite sensible that the public gets to read all of the research they are funding, and it has to be the best way to share ideas and let science happen without any barriers.

But I’m sure we aren’t doing it very efficiently at the moment: some very well-intentioned policies are making publishing a real nuisance in the UK.

Here’s a list of all the places where papers we are publishing at the moment are ending up. When you google a paper title, you are likely to find hits for all of these. You have to hope that they all ended up being the same final version of a paper, and you can’t really be sure which is best to look at:

- **On ArXiv/BioRxiv** – I think preprint servers are a great way to make a version of your paper open access, get it google-able, and get feedback on it. So we put up papers on BioRxiv, and try (but sometimes forget) to make sure they are updated to match the final accepted article in a journal.
- **In the actual journal** – this is generally the nicest to look at version (but not always!). My funders like to have their articles under a CC-BY licence, which is a great idea, but it generally means a Gold route for open access with quite high fees.
- **On PubMedCentral (PMC)** or Europe’s version (or usually for us, both) – PMC is funded by the NIH, the USA’s main medical research agency, and any papers they funded also have to have a version deposited with PMC. This applies even if it is open access – fair enough – I imagine it’s a good idea to have an ‘official backup’ in case a journal shuts down for any reason. Since my funders go for Gold open access it is somewhat redundant, and confuses people when they search on PubMed and have to choose which version to look at, but at least it is a big repository with almost all biomedical research in one place (give or take the European version – please just pool your resources EU and USA! Does Brexit mean we’ll also have to put a version in a UK-PMC too? Probably… groan).
- **On a university/institutional archive** – the UK powers-that-be have (very sensibly) decided that (almost) all papers have to be available open access to be eligible for consideration as part of the next Research Excellence Framework, which decides how much public money universities get. As far as I can work out, every single university has decided (very un-sensibly) that the only way to ensure this is to launch their own individual paper repository where they also host another open access version of the paper. Ours is called ePrints.
- **On a couple of other institutional archives** – nearly all my papers have co-authors in other universities, who also have to submit the paper again to their own institutional repositories!

So, every single university in the UK is creating the IT databases, infrastructure, front-ends to host large numbers of research papers in perpetuity; as well as employing staff to curate and chase academics to put the right versions into the right forms at the right time with the right licences to keep everyone happy, mostly for papers that are already available open access elsewhere. This is an insane use of resources.

The only thing I can suggest is that UK REF people clarify that any paper that has a final open access accepted text in either arXiv/BioRXiv/a journal/PMC/EuropePMC is automatically eligible. For papers that doesn’t cover, the universities need to get together to either: beef up support for subject-specific repositories; or, just fund a single central repository between them, with a good user interface, to cover any subjects that fall down the cracks between the reliable subject repositories above. Maybe the sort of thing our highly-paid VCs and UUK should be organising.


The subject of the postdoc position will be designing new experiments to get as much information as we can on how pharmaceutical compounds bind to ion channels and affect the currents that flow through them. As part of this I would like to explore how to characterise and quantify model discrepancy, and design experiments for that, as well as model selection and parameterisation of the models.

We’ll then use the data generated by these experiments to build models of pharmaceutical drug interactions with ion currents, working with partners in pharmaceutical companies and international regulators to test out these new models. The project will involve learning some of the modelling behind electrophysiology and pharmacology, as well as data science/statistics behind designing experiments and choosing models and parameters. It will build on some of our recent work on novel experimental design, some of which is available in a preprint here.

See http://www.nottingham.ac.uk/jobs/currentvacancies/ref/SCI308217 for details and links to apply. Closing date is Wed 4th October. Informal enquiries to me are welcome (but applications have to go through the official system on the link above).

Gary


This is just a short post to let everyone know that the CiPA *in-silico* Working Group is organising a meeting on November 9th 2017 in Toronto, dedicated to discussing the mathematical modelling aspects of CiPA. You can find more information about the meeting on this page, and register for the meeting on this page.

The plan is for the FDA modelling team to present the work they have been doing to the cardiac modelling community, to get feedback, encourage work in this area, and to network and start new collaborations. Also note there are a handful of speaker slots we are keeping free for late-breaking-abstracts which you can email me to submit short abstracts for (details here) by 30th September.

Please pass this message on to anyone you think may be interested.

Hope to see lots of you there!

Gary

The advisory committee asked some great questions, and I thought it was worth elaborating on one of my answers here. To summarise, quite a few of their questions came down to *“Why don’t you include more detail of known risk factors?”* Things they brought up include:

- long-short-long pacing intervals are often observed in the clinic prior to Torsade de Pointes starting, why not include that?;
- should we model Purkinje cells rather than ventricular cells (perhaps ectopic beats or arrhythmias arise in the Purkinje fibre system)?;
- heart failure is a known risk factor – would modelling these conditions help?

Before answering, it’s worth considering where we are now in terms of ‘mechanistic’ markers of arrhythmic risk. Figure 1 shows how things are assessed at the moment.

It was observed that the risky drugs withdrawn from the market in the late 90s prolonged the QT interval, and for these compounds this was nicely mechanistically linked to block of the hERG channel/IKr (top panel of Fig 1). This all makes nice mechanistic sense – a prolonged QT is related to delayed repolarisation (as in bottom of Fig 1), which in turn is related to block of hERG/IKr.

There’s a couple of reasons prolonged repolarisation is conceptually/mechanistically linked to arrhythmia. Firstly, if you delayed repolarisation ‘*a bit more*’ (continue to decrease the slope at the end of the action potential – middle panel of Fig 1), you’d get repolarisation failure, or after-depolarisations. Secondly, by delaying repolarisation you may cause regions of tissue to fail to be ready for the following wave, termed ‘functional block’.

As a result of the pathway from hERG block to QT prolongation, early ion channel screening focusses on checking compounds don’t block hERG/IKr. The clinical trials tried to avoid directly causing arrhythmias, for obvious reasons, but by looking for QT prolongation in healthy volunteers you would hopefully spot compounds that could have a propensity to cause arrhythmias in unhealthy heart tissue, people with ion channel mutations, people on co-medication with other slightly risky compounds, or other risk factors. This has been remarkably successful, and there have been very few surprises of TdP-inducing compounds sneaking past the QT check without being spotted.

But, there were some hERG blockers on the market that didn’t seem to cause arrhythmias. Our 2011 paper showed why that can happen – there are different mechanistic routes to get the same QT or APD changes (by blocking multiple ion channels rather than just hERG) and if you took multiple ion channel block into account you would get better predictions of risk than simply using the early hERG screening results. So multiple ion channel simulations of single cell APD are a very similar idea to clinical trials of QT (and comparing the two is a good check that we roughly understand a compound’s effects on ion channels).

So clinical QT/simulated APD is a mechanistically-based marker of arrhythmic risk, but we know it still isn’t perfect because some drugs with similar QT prolongation in our healthy volunteers have different arrhythmic risks (see CiPA papers for an intro).

At one end of the scale, some studies advocate whole-organ simulations of TdP in action to assess TdP risk. Here’s a video of the impressive UT-heart simulator from Tokyo that was used in that study.

These simulations definitely have their place in helping us understand the origin of TdP, how it is maintained/terminates, and possibly helping design clinical interventions to deal with it. If we want to go the whole hog and really assess TdP risk completely mechanistically why not do patient-specific whole organ TdP simulations, with mechanics, individualised electrophysiology models, all the known risk factors, the right concentrations of compound, and variations of these throughout the heart tissue, etc. etc.?

Let’s imagine for a minute that we could do that, and got a model that was very realistic for individual patients, and we could run simulations in lots of different patients in different disease states, and observe spontaneous initiation of drug-induced TdP via the ‘correct’ mechanisms (this hypothetical situation ignores the not-inconsiderable extra uncertainties in how well we model regional changes in electrophysiology, what changes between individuals, blood pressure regulation models, individual fibre directions, accurate geometry, etc. etc. etc. which might mean we get more detail but less realism than a single cell simulation!). Let’s also say we could get these huge simulations to run in real time – I think the IBM Cardioid software is about the fastest, and goes at about three times less than real time on the USA’s biggest supercomputer.

That would be brilliant for risk assessment wouldn’t it?

Unfortunately not!

TdP is very rare – perhaps occurring once in 10,000 patient-years of dosing for something like methadone. Which means ultra-realistic simulations would have to run for 10,000 years just to get an N=1 on the world’s biggest supercomputer! It’s going to be quite a while before Moore’s law makes this feasible…

Inevitably then, we are not looking to model all the details of the actual circumstances in which TdP arises, we’re looking for some marker that correlates well with the risk of TdP.

The other extreme is perhaps forgetting about mechanism altogether, and simply using a statistical model, based on historical correlations of IC50s with TdP, to assess the risk. Hitesh Mistry did something a bit like this in this paper (although as I’ve said at conferences – it’s not really a simple statistical correlation model, it’s really a clever minimal biophysically based model, since it uses the Hill equation and a balance of depolarising and repolarising currents!). For block of two or three ion channels, though, it works very well.
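For readers who haven’t met it, the Hill equation converts a drug concentration and an IC50 into the fraction of a current that is blocked. Below is a minimal sketch of that one ingredient only. The function name and the example numbers are illustrative, and this is not a reproduction of the model in that paper.

```python
def fraction_blocked(concentration, ic50, hill=1.0):
    """Hill equation: fraction of a current blocked at a given drug concentration.

    concentration and ic50 must be in the same units; hill is the Hill coefficient.
    """
    if concentration <= 0:
        return 0.0
    return 1.0 / (1.0 + (ic50 / concentration) ** hill)

# Hypothetical compound with an IC50 of 1 (arbitrary units), at three concentrations.
for conc in (0.1, 1.0, 10.0):
    print(conc, fraction_blocked(conc, ic50=1.0))
```

At a concentration equal to the IC50 exactly half the current is blocked, which is the definition of the IC50.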

Why would I like something a bit more mechanistic than that then? I came up with the example in Figure 2 to explain why in the FDA hearing.

Why might ion channel block be like Fig 2? Well, when you’re considering just two or three channels being blocked, then Hitesh’s method (which actually includes a bit of mechanistic Newton’s second law in my analogy!) will work very well, assuming it’s trained on enough compounds of various degrees of block of the three channels, as the blue dots will cover most of the space.

But you might want to predict the outcome of block (or even activation) up to seven or more different ionic currents (and combinations thereof) that could theoretically happen and cause changes relevant to TdP risk. In this case, any method that is primarily based on a statistical regression, rather than mechanistic biophysics, is going to struggle because of the curse of dimensionality. In essence, you’ll struggle to get enough historical compounds to ‘fill the space’ for anything higher than two or three variables/ion currents. You could think of the biophysical models as reducing the dimension of the problem here (in the same way as the biology does, if we’ve got enough bits of the models good enough), so they can output a single risk marker that is then suitable for this historical correlation with risk – without a huge number of compounds.
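To put rough numbers on ‘filling the space’: suppose, purely for illustration, that you wanted training compounds covering five distinct levels of block for each current, in every combination. The function name and the choice of five levels are assumptions made for this sketch.

```python
def compounds_to_fill_space(levels_per_current, n_currents):
    """Number of grid points needed to cover every combination of block levels."""
    return levels_per_current ** n_currents

for n_currents in (2, 3, 7):
    print(n_currents, compounds_to_fill_space(5, n_currents))
```

Three currents need 125 compounds on this crude grid, while seven need 78,125, far more than the number of compounds with well-established clinical risk categories.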

CiPA is pursuing single-cell action potential simulations, looking for markers of arrhythmic risk in terms of quantifying ‘repolarisation stability’ in some sense. I think this is a very sensible approach, geared simply at improving one step on solely APD/QT.

In terms of including more risk factor details in here, as the committee asked originally at the top of this post, the real question is ‘*does it improve my risk prediction?*‘ or not. Hopefully I’ve explained why including all the detail we can think of isn’t obviously going to help. Your ranking of a new compound in terms of the risk of established ones would have to change in order for a new simulated risk marker to make any difference.

To assess whether that difference really was an improved risk prediction we would need to have faith that risk factors were widely applicable to the people who are getting TdP, and that any changes to the models for introducing heart failure etc. are sufficiently well validated and trusted to rely on them. I don’t think we are quite ready for this, as there is plenty to do at the moment trying to ensure there is an appropriate balance of currents in a baseline model (before any are blocked – two relevant papers: paper 1, paper 2), and that the kinetics of drug block of hERG are included, as these are probably important.

Another thought along the committee’s lines is TdP risk for different patient subgroups, instead of a one-size-fits-all approach. This would be very nice, but the same difficulties apply, multiplied! Firstly, getting models that we trust for all these subgroups, with well quantified levels/distributions of ion channel expression and other risk-factor-induced changes. Secondly, even sparser gold standard clinical risk categorisation for all subgroups to test our models on. Unfortunately, with such a rare side effect it is difficult enough to get an overall risk level, never mind risk tailored to individual subgroups. So at present, I think the CiPA proposal of a single cell model (give or take an additional stem-cell derived myocyte prediction perhaps!) and single risk marker is a very sensible first step.

As usual, comments welcome below!

]]>Regular readers of this blog will know that I worry about uncertainty in the numbers we are using to model drug action on electrophysiology quite a lot – see our recent white paper for an intro where we discuss the various sources of uncertainty in our simulations, and a 2013 paper on variability observed when you repeat ion channel screening experiments lots of times. That paper studied variability in the averages of lots of experiments; in contrast, it is also possible to look at the uncertainty that remains when you just do one experiment (or one set of experiments).

We have been looking at screening data that were published recently by Crumb et al. (2016). This is an exciting dataset because it covers an unprecedented number of ion currents (7), for a good number of compounds (30 – a fair number anyway!). The ideal scenario for the CiPA initiative is that we can feed these data into a mathematical model, and classify compounds as high/intermediate/low arrhythmic risk as a result. Of course, the model simulation results are only ever going to be as accurate as the data going in; I’ve sketched what I mean by this in the plot below.

In Figure 1 we can see the same scenario viewed three different ways. We have two inputs into the simulation – a ‘red’ one and a ‘blue’ one. You could think about these as any numbers, e.g. “% block of an ion current”.

If we ignored uncertainty, we might do what I’ve shown on the top row: plug in single values; and get out single outputs. Note that blue was higher than red and gives a correspondingly higher model output/prediction. But how certain were we that red and blue took those values? How certain are we that the output really is higher for blue?

Instead of just the most likely values, it is helpful to think of probability distributions for red and blue, as I’ve shown in the middle row of Figure 1. Here, we aren’t quite sure of their exact values, but we are fairly certain that they don’t overlap (this step of working out distributions on inputs is called “*uncertainty characterisation*”); their most likely values are the same as the ones we used before in the top row. When we map these through our model* (a step called “*uncertainty propagation*”) we get a probability distribution for each output.

In the bottom row of Figure 1 we see another scenario – here there is lots of overlap between the red and blue input distributions. So we think blue is higher than red, but it could easily be the other way around. Now we map these through our model, and get correspondingly large and overlapping probability distributions on our outputs. Even in the best-case scenario (i.e. a linear model output depending solely on this input), we would end up with the same distribution overlap as the inputs; and for non-linear models, where outputs are complicated functions of many inputs, the distributions could overlap much more. So bear in mind that it isn’t possible to get less-overlapping distributions out of the model than the ones that went in, but it is possible to get more overlapping ones. The uncertainty always increases if anything (I hadn’t really thought about it before, but that might be related to why we think about entropy in information theory?).

If we consider now that the output is something like ‘safety risk prediction for a drug’, could we distinguish whether red or blue is a more dangerous drug? Well, maybe we’ll be OK with something like the picture in the middle row of Figure 1. Here my imaginary risk indicator model output distinguishes between the red and blue compounds quite well. But we can state categorically “no” if the inputs overlap to the extent that they do in the bottom row, before we even feed them in to a simulation. So we thought it would be important to do uncertainty characterisation and work out the uncertainty in ion channel block that we have from dose-response curves before doing action potential simulations. This is the focus of our recent paper in Wellcome Open Research**.

In the new paper, Ross has developed a Bayesian inference method and code for inferring probability distributions for pIC50 values and Hill coefficients.

The basic idea is shown in Figure 2, where we compare the approach of fitting a single vs. distribution of dose-response curves:

On the left of Figure 2 we see the usual approach that’s taken in the literature, fitting a line of best fit through all the data points. On the right, we plot samples of dose-response curves that may have given rise to these measurements.

The method works by inferring the pIC50 and Hill coefficients that describe the curve, but also the observation error that is present in the experiment, i.e. the ‘statistical model’ is:

data = curve(pIC50, Hill) + Normal noise (mean = 0, standard deviation = σ).

One of the nice features of using a statistical model like this is that it tries to learn how much noise is on the data by looking at how noisy the data are, and therefore generates dose-response curves spread appropriately for this dataset, rather than with a fixed number for the standard deviation of the observational noise.
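To make the statistical model concrete, here is a minimal sketch of this kind of inference in Python. It is not Ross’s code from the paper: the random-walk Metropolis sampler, the flat priors and their bounds, the step sizes and the function names are all assumptions chosen for illustration. It treats the curve as a Hill curve giving % block, and samples pIC50, Hill coefficient and σ together.

```python
import math
import random


def block_percent(conc_uM, pic50, hill):
    """Hill curve: % block at a concentration in uM for a given pIC50 and Hill coefficient."""
    ic50_uM = 10.0 ** (6.0 - pic50)
    return 100.0 / (1.0 + (ic50_uM / conc_uM) ** hill)


def log_likelihood(params, concs, data):
    """Log-likelihood of data = curve(pIC50, Hill) + Normal(0, sigma) noise."""
    pic50, hill, sigma = params
    total = 0.0
    for conc, observed in zip(concs, data):
        residual = observed - block_percent(conc, pic50, hill)
        total += -0.5 * (residual / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total


def in_prior_support(params):
    """Flat priors on assumed ranges (illustrative bounds, not taken from the paper)."""
    pic50, hill, sigma = params
    return 2.0 <= pic50 <= 10.0 and 0.1 <= hill <= 5.0 and 0.0 < sigma <= 50.0


def metropolis(concs, data, start, step_sizes, n_iterations, rng):
    """Random-walk Metropolis sampling of (pIC50, Hill, sigma)."""
    current = list(start)
    current_ll = log_likelihood(current, concs, data)
    samples = []
    for _ in range(n_iterations):
        proposal = [value + rng.gauss(0.0, step) for value, step in zip(current, step_sizes)]
        if in_prior_support(proposal):
            proposal_ll = log_likelihood(proposal, concs, data)
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
                current, current_ll = proposal, proposal_ll
        samples.append(tuple(current))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic experiment: four repeats at six concentrations, true pIC50 = 5.5, Hill = 1, sigma = 5.
    concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0] * 4
    data = [block_percent(c, 5.5, 1.0) + rng.gauss(0.0, 5.0) for c in concs]
    chain = metropolis(concs, data, (5.0, 1.0, 10.0), (0.05, 0.1, 0.5), 20000, rng)
    kept = chain[5000:]  # discard burn-in
    print("posterior mean pIC50:", sum(s[0] for s in kept) / len(kept))
    print("posterior mean sigma:", sum(s[2] for s in kept) / len(kept))
```

Each retained sample is one plausible (pIC50, Hill, σ) triple, so plotting the Hill curve for a few hundred of them gives a spread of curves of the kind described above. Because σ is sampled too, the spread adapts to how noisy the data are.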

At this point we calculate what the % block of the current (inputs into model) might look like from these inferred curves; this is shown in Figure 3:

The new paper also extends this approach to hierarchical fitting (saying data from each experiment were generated with different pIC50 and Hill coefficients), but I will let you read the paper for more information on that.

So what can we conclude about the distribution of inputs for the Crumb dataset? Well, that’s a little bit tricky since it’s 7 channels and therefore a seven dimensional thing to examine. To have a look at it we took 500 samples of each of the seven ion current % blocks. This gives a table where each row looks like:

Drug (1-30) | Sample (1-500) | % block current 1 | % block current 2 | ... | % block current 7 |

So the seven % blocks are the seven axes or dimensions of our input dataset.

I then simply did a principal components analysis that will separate out the data points as much as possible by projecting the data onto new axes that are linear combinations of the old ones. You can easily visualize up to 3D, as shown in the video below.
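For anyone who hasn’t met PCA before, here is a rough sketch of the idea in plain Python. It is not the analysis script used for the video: it only finds the first principal component (by power iteration on the covariance matrix), the function names are made up for illustration, and the three-column synthetic data stand in for the real seven-column table of % blocks.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a table whose rows are samples and columns are variables."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)] for i in range(d)]


def first_principal_component(cov, iterations=500):
    """Power iteration: returns the leading eigenvector of cov and its eigenvalue."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


def explained_fraction(cov, eigenvalue):
    """Fraction of the total variance (trace of cov) captured by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))


# Synthetic stand-in data: two correlated columns plus one column of pure noise.
rng = random.Random(1)
rows = []
for _ in range(500):
    t = rng.gauss(0.0, 10.0)
    rows.append([t + rng.gauss(0.0, 1.0), 0.5 * t + rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])

cov = covariance_matrix(rows)
direction, eigenvalue = first_principal_component(cov)
print("first principal axis:", direction)
print("fraction of variance explained:", explained_fraction(cov, eigenvalue))
```

The projection of each sample onto `direction` is its coordinate on the first new axis; repeating the procedure on what is left over gives the second and third axes.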

In the video above you see the samples for each compound plotted in a different colour (the PCA wasn’t told about the different compounds). So each cloud is in a position that is determined by what % block of each channel the compounds produce at their free Cmax concentrations (given in the Crumb paper). What we see is that about 10 to 12 of the compounds are in distinct distributions, so they block currents in a combination unlike other compounds, i.e. behave like the different inputs in the middle row of Figure 1. But the remaining ones seem to block in distributions that could easily overlap, like the inputs in the bottom row of Figure 1. As you might expect, this is clustered close to the origin (no block) co-ordinate.

Here, these first three principal components happen to describe 94% of the variation in the full 7D dataset, so whilst it is possible that there is some more discrimination between the ‘clouds’ of samples for each compound in higher dimensions, your first guess would be that this is not likely (but this is a big caveat that needs exploring further before reaching firm conclusions). So it isn’t looking promising that there is enough information here, in the way we’ve used it anyway, to distinguish between the input values for half of these compounds.

But once you’ve got this collection of inputs you can simply do the uncertainty propagation step anyway and see what comes out. We did this for a simple O’Hara 1Hz steady state pacing experiment, applying conductance block according to the video’s cloud values for each of the 30 compounds, to work out predicted output distributions of APD90 for each compound. The results are shown in Figure 4.

Figure 4 is in line with what we expected from the video above, so it gives us some pause for thought. Perhaps five compounds have distinctly ‘long’ APD, and perhaps two have distinctly ‘short’ APD, but this leaves 23 compounds with very much overlapping ‘nothing-or-slight prolongation’ distributions. The arrhythmic risk associated with each of these compounds is not entirely clear (to me!) at present, so it is possible that we could distinguish some of them, but it is looking a bit as if we are in the scenario shown in the bottom right of Figure 1 – and this output overlaps to such an extent that it’s going to be difficult to say much.

So we’re left with a few options:

- Do more experiments (the larger the number of data points, the smaller the uncertainty in the dose-response curves); whether narrower distributions allow us to classify the compounds according to risk remains to be seen.
- Examine whether we actually need all seven ion currents as inputs (perhaps some are adding noise rather than useful information on arrhythmic risk).
- Get more input data that might help to distinguish between compounds – the number one candidate here would be data on the kinetics of drug binding to hERG.
- Examine other possible outputs (not just APD) in the hope they distinguish drugs more, but in light of the closeness of the input distributions, this is perhaps unlikely.

So to sum up, it is really useful to do the uncertainty characterisation step to check that you aren’t about to attempt to do something impossible or extremely fragile. I think we’ve done it ‘right’ for the Crumb dataset, and it suggests that we will need more, or less(!), or different data to distinguish pro-arrhythmic risk. Comments welcome below…

* The simplest way to do this is to take a random sample of the input probability distribution, run a simulation with that value to get an output sample, then repeat lots and lots of times to build up a distribution of outputs. This is what’s known as a Monte Carlo method (using random sampling to do something that in principle is deterministic!). There are some cleverer ways of doing it faster in certain cases, but we didn’t use them here!
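A bare-bones sketch of that Monte Carlo recipe in Python is below. The `toy_model` function is a made-up stand-in for a real action potential simulation, and the Normal input distributions, their means and spreads are assumptions for illustration only; none of these numbers come from the Crumb dataset.

```python
import random


def toy_model(block_percent):
    """Hypothetical nonlinear 'simulation': maps % block to an APD-like output in ms."""
    return 250.0 * (1.0 + (block_percent / 100.0) ** 2)


def sample_outputs(input_mean, input_sd, n_samples, rng):
    """Draw inputs from a Normal distribution (clipped to 0-100 %) and push each through the model."""
    outputs = []
    for _ in range(n_samples):
        block = min(100.0, max(0.0, rng.gauss(input_mean, input_sd)))
        outputs.append(toy_model(block))
    return outputs


def prob_red_exceeds_blue(red_mean, blue_mean, input_sd, n_samples, rng):
    """Estimate how often the 'red' output comes out above the 'blue' one."""
    red = sample_outputs(red_mean, input_sd, n_samples, rng)
    blue = sample_outputs(blue_mean, input_sd, n_samples, rng)
    return sum(r > b for r, b in zip(red, blue)) / n_samples


rng = random.Random(42)
# Well separated inputs (like the middle row of Figure 1) versus heavily overlapping ones (bottom row).
print("narrow inputs:", prob_red_exceeds_blue(20.0, 40.0, 2.0, 20000, rng))
print("wide inputs:  ", prob_red_exceeds_blue(20.0, 40.0, 20.0, 20000, rng))
```

With narrow input distributions red essentially never comes out above blue, whereas with wide ones the ordering flips in a sizeable fraction of samples, which is the overlap problem sketched in Figure 1.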

** By the way: Wellcome Open Research is a new open-access, open-data, journal for Wellcome Trust funded researchers to publish any of their findings rapidly. It is a pre-print, post-publication review journal, so reviewers are invited after the manuscript is published and asked to help the authors revise a new version. The whole review process is online and open, and it follows the f1000research.com model. So something I’m happy we’ve supported by being in the first issue with this work.

]]>Ablation is a treatment for some abnormal heart rhythms, which works by killing areas of the heart that we think are misbehaving and giving rise to abnormal rhythms. The idea behind ablation is that if you identify and kill a region of tissue that is allowing electrical signals to propagate ‘the wrong way’, then afterwards you will have created a non-conducting wall of scar tissue that prevents any abnormal electrical activity. As a treatment strategy, there’s clearly room for improvement; I hope we’ll soon look back on it as Dr McCoy does for dialysis in the scene in Star Trek IV. But at the moment, it’s the best we’ve got for certain conditions.

Usually, and ideally, you stop ablating when the arrhythmia terminates, so it is very effective in the short term. Unfortunately the success rate in the long term isn’t great – after a single round of atrial ablation people stay (for at least 3 years) arrhythmia free in only about 50% of cases^{1}. So why do people relapse after ablation? Well it can be because electrical waves somehow get through the scar regions (see ^{1} for further references to this).

You also get similar scars forming after a heart attack – in this case the cardiac muscle itself doesn’t get a blood supply, and dies because of that, leaving a scar region. This damage leads to scars with larger border zones than in ablation, as these border zones have a low but non-zero blood supply, so you get semi-functioning tissue where some of the muscle cells (myocytes) survive, and others don’t. The problem now is the opposite to ablation – you’d like to restore conduction in these regions to get the rest of the heart beating as well as it can, but electrical activity is disturbed by the presence of these scar regions and their border zones, and arrhythmias become more likely (or even occur instantly in the case of big heart attacks).

Greg works on optical mapping for whole hearts in various species to study what happens around these scars. This is a clever technique where you put a dye into a tissue, it sticks to the membrane of cells, and when excited by an external light source, it emits light at an intensity dependent on the voltage across the cell membrane. So you can excite the dye with light at a certain wavelength, and record the light emitted at another wavelength in a camera to create a map of electrical activity in your sample, and you can do this in real time to see how electrical waves move across the heart. Here’s a simulation of electrical activity that Pras did with our Chaste software, and it includes a comparison with optical mapping at 39 seconds in the video below (this is for a completely different situation – just to show you how optical mapping works!):

Greg found that in some intact hearts from mice recovering from ablation he could observe the optical mapping signal appearing to go straight through the middle of the scars. This is unexpected, to say the least, and required **a lot** of further investigation. I’ll let you read the full story in the paper, but suffice to say we confirmed that:

- there aren’t any myocytes (normal heart muscle cells) left in the scars (shown by doing histology and using an electron microscope), although there are lots of cells that are largely fibroblasts left there; the cells in the scars are definitely ‘non myocytes’ anyway;
- the observations aren’t just optical mapping artefacts (shown by doing direct micro-electrode recordings, showing that local electrical pulses from a suction electrode diffuse in, and even putting black foil in the middle of the hearts!);
- the observations depend on electrical coupling between myocytes and fibroblasts, as it goes away if you don’t have this (shown by specifically breeding a strain of mice where you could knock out Connexin43 in just the non myocytes – clever stuff!). In fibroblasts, Connexin43 is thought to form gap junctions electrically coupling them to only myocytes, not to other fibroblasts. If you knock out Connexin43 proteins in the non myocytes with this special mouse strain, then the optical mapping signal you record is much smaller in the scar region, and the electrical signal no longer diffuses in from a suction electrode, indicating that fibroblast-myocyte junctions are required to see conduction into the scar.*

To quote from the article, this is “the first direct evidence in the intact heart that the cells in myocardial scar tissue are electrically coupled to the surrounding myocardium and can support the passive conduction of action potentials.”

When Greg presented the experimental results at conferences he had a hard time convincing people that the experimental results were real, and not some odd artefacts of the experimental set up! It seemed that nobody expected electrical waves to travel through these regions, even if they were populated with cells, as these cells weren’t cardiac myocytes. It was counter-intuitive to most cardiac electrophysiologists that any signal would get across this gap, because that’s the whole point of ablation! This is where we thought it would be quite interesting to see what you would expect, from the standard model of electrical conduction, if there are neutral non-myocyte cells in a lesion that are coupled together. By *neutral* we mean not providing any ‘active’ transmembrane currents, and just providing passive electrical resistance.

Our standard simplest model for the reaction-diffusion of voltage in cardiac tissue is the monodomain equation (so-called because there’s a two-domain extension called the bidomain equation):

There are a few terms here that need defining: *V* is voltage across the cell membrane – which is the quantity that we consider to be diffusing around, rather than the individual types of ions themselves. *I _{ion}* is the current across a unit surface area of the cell membrane at any point in space (varies across space),

So, what will be different in the lesion?:

- **𝜎**, the cell-cell conductivity, could be varying. This would represent how much Connexin was present to link the lesion cells together with gap junctions.
- 𝜒, the surface area of membrane in a unit volume of tissue, could also be varying. This would represent non-myocyte membrane density in the lesion.
- 𝐶_{𝑚}, the capacitance of a unit of membrane area, probably won’t change (membrane made of same stuff). Beware though, a maths/biology language barrier means experimentalists might call 𝜒 ‘capacitance’ to confuse us all!
- *I_{ion}* will be small compared to myocytes, these cells aren’t actively trying to propagate electrical signals. So we did the study twice, once with this set to zero in the scar, and once with it set to use a previously published model of fibroblast electrophysiology (didn’t make any visual difference to the results).

Now in the scar region we aren’t applying an external stimulus so *I _{stim} = 0*, and we can divide through by χ to get:

A nice property springs out – that the entire behaviour of the scar region (in terms of difference to the rest of the heart) is determined by the ratio of * 𝜎/χ*. So we introduce a factor ρ that will scale this quantity.
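For readers following along without the equation images, the steps described above can be written out as below. This is a reconstruction under assumptions rather than a copy of the original equations: it uses one common sign convention for the monodomain equation, assumes 𝜒 is uniform within the scar so it can be taken inside the divergence, and the subscript “normal” for the healthy-tissue value of 𝜎/𝜒 is my own notation.

```latex
\begin{align}
\chi \left( C_m \frac{\partial V}{\partial t} + I_{ion} \right)
  &= \nabla \cdot \left( \sigma \nabla V \right) + I_{stim}
  && \text{(monodomain equation)} \\
C_m \frac{\partial V}{\partial t} + I_{ion}
  &= \nabla \cdot \left( \frac{\sigma}{\chi} \nabla V \right)
  && \text{(in the scar: } I_{stim} = 0 \text{, divided by } \chi \text{)} \\
C_m \frac{\partial V}{\partial t} + I_{ion}
  &= \nabla \cdot \left( \rho \left[ \frac{\sigma}{\chi} \right]_{\text{normal}} \nabla V \right)
  && \text{(scar value written as } \rho \text{ times the normal value)}
\end{align}
```

Written this way, 𝜎 and 𝜒 only ever appear in the scar through their ratio, which is why a single factor ρ is enough to describe it.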

Even before we run any simulations, we’ve learnt something here by writing down the model and doing this “parameter lumping” (see nondimensionalisation for a framework in which to do this kind of thing rigorously!). Just by looking at ρ = 1 we see that there are an infinite number of ways we could get exactly the same behaviour. The scar cells could be just as well coupled (* 𝜎*) and just as densely packed (

So we did a simulation with a mouse ventricular myocyte model, on a realistic sized bit of tissue 5mm x 5mm, with a 2mm diameter lesion in the middle. There are no preferential fibre directions here, something that could easily make the conduction more or less likely.

So what behaviours did we predict with different values of ρ? First for ρ = 1; the * 𝜎*/

In the video we see that conduction does indeed ‘get across’ the scar, even though it is not ‘driven’ as such as it is in the rest of the tissue, but instead is simply diffusing across that region. We predict the wave will get across a 2mm scar easily with this value of ρ.

What about with more membrane (or less conductivity): ρ = 0.1 (Remembering this would happen whenever * 𝜎*/

This time, you can see that the wave struggles to get across the lesion, but the membrane voltage still gets high, and that would probably still be high enough to record in optical mapping.

With less membrane density relative to the well-coupled-ness of the lesion cells (ρ = 10)? (Remembering this would happen whenever * 𝜎*/

So our conclusion was that it’s perfectly possible that a voltage signal could be recorded in the lesion, and a voltage signal could effectively travel straight through the scar, and conduction carry on out the other side. We’re not entirely sure about the value that ρ should take – but this behaviour was fairly robust, and matched what we saw in the experiments.
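If you want a feel for why a passive region can still carry a sizeable voltage, here is a deliberately crude 1D toy in Python. It is not the Chaste simulation from the paper: it sets *I_{ion}* to zero so the scar is pure diffusion, drives one edge with a square voltage pulse standing in for the arriving wave, uses a no-flux far edge, and works in normalised voltage (0 = rest, 1 = depolarised). The baseline diffusion coefficient, pulse length and grid are illustrative assumptions.

```python
def peak_voltage_at_scar_centre(rho, scar_width_mm=2.0, dx_mm=0.1,
                                base_diffusion_mm2_per_ms=0.125,
                                pulse_ms=10.0, total_ms=30.0):
    """Peak normalised voltage reached in the middle of a passive 1D scar.

    Solves C_m dV/dt = rho * (sigma/chi) d2V/dx2 with explicit finite differences,
    where base_diffusion_mm2_per_ms plays the role of (sigma/chi)/C_m in normal tissue.
    """
    diffusion = rho * base_diffusion_mm2_per_ms
    # Keep the explicit scheme stable (needs diffusion * dt / dx^2 <= 0.5).
    dt = min(0.05, 0.4 * dx_mm ** 2 / diffusion)
    n_nodes = int(round(scar_width_mm / dx_mm)) + 1
    centre = n_nodes // 2
    v = [0.0] * n_nodes
    peak = 0.0
    t = 0.0
    while t < total_ms:
        # Driven edge: depolarised while the 'action potential' lasts, then back to rest.
        v[0] = 1.0 if t < pulse_ms else 0.0
        new_v = v[:]
        for i in range(1, n_nodes - 1):
            new_v[i] = v[i] + dt * diffusion * (v[i - 1] - 2.0 * v[i] + v[i + 1]) / dx_mm ** 2
        # No-flux far edge, implemented with a mirrored ghost node.
        new_v[-1] = v[-1] + dt * diffusion * 2.0 * (v[-2] - v[-1]) / dx_mm ** 2
        v = new_v
        t += dt
        peak = max(peak, v[centre])
    return peak


for rho in (0.1, 1.0, 10.0):
    print("rho =", rho, "peak voltage at scar centre =", peak_voltage_at_scar_centre(rho))
```

In this toy the peak voltage in the middle of the scar rises with ρ, which is the same qualitative trend as in the full simulations; it says nothing about whether the myocytes on the far side would actually fire, which needs the real cell model.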

This simple model predicted many of the features of the recordings that were made from the scar region – see Figure 7 in the paper, and compare with experiments in Figure 4. So, it helped Greg answer accusations of “This is impossible!” that he got when he presented stuff at conferences, as he could reply that “If the cells have gap junctions, even without any voltage-gated ion channels of their own, this is exactly what we’d expect – see!”.

As usual, all the code is up on the Chaste website if anyone wants to have more of a play with this.

1 “Long-term Outcomes of Catheter Ablation of Atrial Fibrillation: A Systematic Review and Meta-analysis” Ganesan et al. J Am Heart Assoc.(2013)2:e004549 doi:10.1161/JAHA.112.004549

*Incidentally, I’d like to do a sociology experiment where I give biologists and mathematicians logic problems to do mentally. My hypothesis is that biologists would beat mathematicians, as they are always carrying around at least ten “if this then that” assumptions in their heads. Then I’d give them a pen and paper and see if the situation reversed…

]]>In our case this is often ‘% inhibition’ for a given ionic current, a consequence of a drug molecule binding to, and blocking, ion channels of a particular type on a cell’s membrane.

A while back we were interested in what the distribution of IC50s would be if you repeated an experiment lots and lots of times. We asked AstraZeneca and GlaxoSmithKline if they had ever done this, and it turned out both companies had hundreds, if not thousands, of repeats as they did positive controls as part of their ion channel screening programmes. We plotted histograms of the IC50 values and Hill coefficients from the concentration-effect curves they had fitted, and found these distributions:

After some investigation, we found the standard probability distributions that both the IC50s and Hill coefficients from these experiments seemed to follow very well were Log-logistic.

Now, where do pIC50s come in? A pIC50 value is simply a transformation of an IC50, defined with a logarithm as:

pIC50 [log M] = -log_{10}(IC50 [M]) = -log_{10}(IC50 [μM]) + 6
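As a quick sketch of that definition in Python (the function names and the table of unit offsets are my own, for illustration):

```python
import math

# Number of powers of ten between each concentration unit and molar.
UNIT_OFFSETS = {"M": 0.0, "mM": 3.0, "uM": 6.0, "nM": 9.0}


def pic50_from_ic50(ic50, units="uM"):
    """pIC50 [log M] from an IC50 given in the stated units."""
    return -math.log10(ic50) + UNIT_OFFSETS[units]


def ic50_from_pic50(pic50, units="uM"):
    """IC50 in the stated units from a pIC50 [log M]."""
    return 10.0 ** (UNIT_OFFSETS[units] - pic50)


print(pic50_from_ic50(1.0, "uM"))     # an IC50 of 1 uM is a pIC50 of 6
print(pic50_from_ic50(30.0, "nM"))    # a potent 30 nM blocker
print(ic50_from_pic50(5.0, "uM"))     # a pIC50 of 5 is an IC50 of 10 uM
```

Whatever units the IC50 arrives in, the pIC50 comes out on the same log M scale.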

At this point it is good to note that logarithms to the base 10 (also known as common logarithms) were invented by Henry Briggs who, like me, was born in Halifax!

The ‘pIC50’ is analogous to ‘pH‘ in terms of its negative log relationship with Hydrogen ion concentration, which is more familiar to most of us.

The logarithm means that pIC50s end up being distributed Logistically, as:

X ~ Loglogistic(α, β), ...(1)

implies that

ln(X) ~ Logistic(μ, σ), ...(2)

as shown below:

You can see how nicely these distributions work across a lot of different ion currents and drug compounds in the Elkins et al. paper supplement.

More recently, we ran into an interesting question: “Given just a handful of IC50 values should we take the mean value as ‘representative’, or the mean value once they are converted to pIC50s”?

Well, ideally, you’d try and do rigorous inference on the parameters of the underlying distribution – as we did in the original paper, and as our ApPredict project will try to do for you if you install it. To do that across multiple experiments you probably want to use mixed-effect or hierarchical inference models. But it’s fair enough to want a rough and ready answer in some cases, especially if you don’t know much about the spread of your data. A tool to try and infer the spread parameter σ from a decent number of repeated runs of the same experiment is something I’ll try to provide soon.

But, let’s just say you want to have this ‘representative’ effect of a drug, given a handful of dose-response curves. You’ve got a few options: you could take a load of IC50 values, and take the mean of those; or you could take a load of pIC50 values, and take the mean of those. Or perhaps the median values? But which distribution should you use? Which would be more representative, and what is the behaviour you’re looking for? Answering this was a bit more interesting and involved than I expected…

First off, let’s look at some of the properties of the distributions shown in Figure 3. Here are the theoretical properties of both distributions (N.B. there are analytic formulae for these entries, which you can get off the wikipedia pages for each distribution (Loglogistic, Logistic). I converted the answers back to pIC50, for easy comparison, but the IC50s were really taken from the right hand distribution in Figure 3 in IC50 units.).

| Theoretical Results | Mean | Median | Mode |
|---|---|---|---|
| From IC50 distbn | 5.836 | 6 (μ) | 6.199 |
| From pIC50 distbn | 6 (μ) | 6 (μ) | 6 (μ) |

So as you might expect (from looking at it), there’s certainly a skew introduced into the Loglogistic distribution on the right of Figure 3 for the IC50 values.
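If you want to check numbers like these yourself, the standard Loglogistic formulae for the mean, median and mode are easy to code up. In the sketch below μ = 6 is the centre shown in the table, but the spread σ = 0.2 log units is an assumption on my part (it is not quoted in the text above, though it does appear to reproduce the tabulated values). The function name is made up for illustration.

```python
import math

LN10 = math.log(10.0)


def ic50_summaries_as_pic50(mu_pic50, sigma_pic50):
    """Mean, median and mode of the IC50 distribution, each converted back to pIC50.

    Assumes pIC50 ~ Logistic(mu, sigma), so IC50 [uM] ~ Loglogistic(alpha, beta)
    with alpha = 10^(6 - mu) and beta = 1 / (ln(10) * sigma).
    """
    alpha = 10.0 ** (6.0 - mu_pic50)
    beta = 1.0 / (LN10 * sigma_pic50)
    if beta <= 1.0:
        raise ValueError("the Loglogistic mean is only defined for beta > 1")
    mean_ic50 = alpha * (math.pi / beta) / math.sin(math.pi / beta)
    median_ic50 = alpha
    mode_ic50 = alpha * ((beta - 1.0) / (beta + 1.0)) ** (1.0 / beta)
    return tuple(6.0 - math.log10(x) for x in (mean_ic50, median_ic50, mode_ic50))


mean_p, median_p, mode_p = ic50_summaries_as_pic50(6.0, 0.2)
print("mean:", round(mean_p, 3), "median:", round(median_p, 3), "mode:", round(mode_p, 3))
```

The skew shows up as the mean IC50 sitting above the median (so its pIC50 equivalent is below 6) and the mode sitting below it.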

So what to measure depends on what kind of ‘representative’ behaviour we are after. Or in other words, what is a drug really doing when we get these distributions for observations of its action? Well, you really want to be inferring the ‘centering’ distribution parameter (μ), which in this case would be the mean/median/mode pIC50 or the median IC50. You will already get the impression that it’s most useful to think in pIC50s – as the distribution is symmetric the mean, median and mode are all the same.

But what about the more realistic case of the properties of a handful of samples of those distributions? I just simulated that and show the results in Figure 4 (I think you could probably do this analytically on paper, given time, but I haven’t yet!).

It seems the only statistics likely to give you back an unbiased and consistent estimate are the mean and median of pIC50 values, since that distribution is symmetric. The IC50 distribution isn’t symmetric, and so taking the mean of IC50 samples leads to a bias; it does however seem to give you a good estimate for the median of the IC50 distribution (better than the median of a sample of IC50s does!) for a low N – see top right plot of Figure 4. As N increases you do eventually get a distribution whose peak is at the mean, but N needs to be quite a lot larger than your average (no pun intended) experimental N. As you might expect, the median pIC50 is not quite as good a measure as the mean for the centre of the pIC50 distribution (but that’s hard to see visually in Fig 4, where it does almost as well).
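Here is a small Python sketch of this kind of simulation. It is not the Matlab script used for Figure 4, and it only compares the long-run average of two of the estimators rather than plotting their full distributions. The spread σ = 0.2, the sample size N = 5 and the function names are assumptions for illustration.

```python
import math
import random


def sample_pic50(mu, sigma, rng):
    """Draw one pIC50 from Logistic(mu, sigma) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can in principle return exactly 0
        u = rng.random()
    return mu + sigma * math.log(u / (1.0 - u))


def compare_estimators(mu, sigma, n_per_experiment, n_repeats, seed=1):
    """Average, over many small experiments, of two ways to summarise a handful of values.

    Returns (average of mean-pIC50, average of the pIC50 equivalent of mean-IC50).
    """
    rng = random.Random(seed)
    via_pic50, via_ic50 = [], []
    for _ in range(n_repeats):
        pic50s = [sample_pic50(mu, sigma, rng) for _ in range(n_per_experiment)]
        ic50s_uM = [10.0 ** (6.0 - p) for p in pic50s]
        via_pic50.append(sum(pic50s) / n_per_experiment)
        via_ic50.append(6.0 - math.log10(sum(ic50s_uM) / n_per_experiment))
    return sum(via_pic50) / n_repeats, sum(via_ic50) / n_repeats


avg_mean_pic50, avg_from_mean_ic50 = compare_estimators(6.0, 0.2, 5, 5000)
print("mean of pIC50s, averaged over experiments:       ", avg_mean_pic50)
print("pIC50 of mean IC50, averaged over experiments:   ", avg_from_mean_ic50)
```

Averaging in pIC50 space lands on μ, while averaging IC50s and converting afterwards comes out systematically lower; within any single experiment the second number can never exceed the first, because an arithmetic mean is never smaller than the corresponding geometric mean.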

You could point out that none of these “make that much difference” to the plots above, but you will introduce a bias if you use the wrong statistic, and for the semi-realistic distributions that we’ve got as an example here, your estimate of pIC50 = 5.836 versus the true pIC50 = 6 does give a block error of almost 10% when you substitute it back into a Hill curve of slope 1 at 1μM.
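That block error is quick to check by substituting both pIC50s into a Hill curve. A minimal sketch (the function name is mine):

```python
def hill_block_percent(conc_uM, pic50, hill=1.0):
    """% block from a Hill curve at a concentration in uM."""
    ic50_uM = 10.0 ** (6.0 - pic50)
    return 100.0 / (1.0 + (ic50_uM / conc_uM) ** hill)


true_block = hill_block_percent(1.0, 6.0)       # true pIC50 = 6 at 1 uM
biased_block = hill_block_percent(1.0, 5.836)   # estimate from the mean of IC50s
print("true block (%):   ", true_block)
print("biased block (%): ", biased_block)
print("difference (%):   ", true_block - biased_block)
```

The true curve gives 50% block at 1μM, and the biased estimate gives a little under 41%, so the difference is a bit over nine percentage points.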

Importantly, pIC50s are much nicer numbers to deal with: they are always given in the same units of log M, and you can recognise at a glance for your average pharmaceutical compound whether you’re likely to have no block (<1 ish), very weak activity (2-3ish) which is perhaps just noise, low block (4-5ish) or strong block (>6ish) – give or take the concentration that the compound is going to be present at – see Figure 1! These easy-to-grasp numbers make it much easier to spot typos than it is when you’re looking at IC50s in different units. We’ve also seen parameter fitting methods that struggle with IC50s, but are happier working in log-space with pIC50s, as searching (in some sense ‘linearly’) for a pIC50 within say 9 to 2 is easier than searching for an IC50 in 1nM to 10,000μM.

So my conclusion is that I’ll **try and work with pIC50s**, and if you need a quick summary statistic, **use the mean of pIC50s**. They seem to occur in symmetric distributions with samples that behave nicely, and therefore are generally much easier to have sensible intuition about!

Now Matlab defines the arguments of its Loglogistic distribution to be exactly the μ and σ of the Logistic distribution that ln(X) follows, but another common parameterisation is α = exp(μ) and β = 1/σ.

Since our pIC50s are in log10 we need to use the following identity:

log_{b}(a) = log_{d}(a) / log_{d}(b), so log_{10}(a) = ln(a) / ln(10)

Note that if we follow the Matlab way of doing things, the following relationships are useful:

IC50 [μM] ~ Loglogistic(μ, σ) implies that pIC50 [log M] ~ Logistic(6 - μ/ln(10), σ/ln(10));

or the other way,

pIC50 [log M] ~ Logistic(μ, σ) implies that IC50 [μM] ~ Loglogistic(ln(10)*(6 - μ), ln(10)*σ).
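These conversions are easy to get the wrong way round, so here is a small Python sketch of them (this is not the Matlab script mentioned below, and the function names are made up for illustration). It follows the Matlab-style parameterisation, where the Loglogistic arguments are the μ and σ of ln(IC50 [μM]), and also converts to the α, β form.

```python
import math

LN10 = math.log(10.0)


def loglogistic_from_pic50_logistic(mu_pic50, sigma_pic50):
    """Logistic(mu, sigma) on pIC50 -> Matlab-style Loglogistic(mu, sigma) on IC50 [uM]."""
    return LN10 * (6.0 - mu_pic50), LN10 * sigma_pic50


def pic50_logistic_from_loglogistic(mu_ln_ic50, sigma_ln_ic50):
    """Matlab-style Loglogistic(mu, sigma) on IC50 [uM] -> Logistic(mu, sigma) on pIC50."""
    return 6.0 - mu_ln_ic50 / LN10, sigma_ln_ic50 / LN10


def alpha_beta_from_matlab(mu_ln_ic50, sigma_ln_ic50):
    """Matlab-style (mu, sigma) -> the other common (alpha, beta) parameterisation."""
    return math.exp(mu_ln_ic50), 1.0 / sigma_ln_ic50


# Example: pIC50 centred on 6 with an assumed spread of 0.2 log units.
mu_ln, sigma_ln = loglogistic_from_pic50_logistic(6.0, 0.2)
print("Matlab-style Loglogistic parameters:", mu_ln, sigma_ln)
print("alpha, beta:", alpha_beta_from_matlab(mu_ln, sigma_ln))
print("round trip back to pIC50:", pic50_logistic_from_loglogistic(mu_ln, sigma_ln))
```

A round trip through both functions should return the parameters you started with, which is a handy sanity check on the signs and factors of ln(10).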

Note that in Matlab you type ‘log’ for ‘ln’ and ‘log10’ for log_{10}. I’ve uploaded the Matlab script I used for this post here.

]]>