Saturday, January 12, 2008

Temperature Modeling, part 2

Since my last “Temperature Modeling” paper came out, there has been a lot of good discussion concerning the paper itself, the findings, the spreadsheet, the theory, and a lot of other concepts intrinsically bound to the airmass estimation process.  I am going to try to address as many of these issues here as I can for further clarification, as you guys really do bring up a lot of good points, forcing me to go back, rethink, recheck, and reevaluate a lot of previous notions.

The spreadsheet accompanying the paper is NOT a solution to the bias tuning problem.  It is merely a simulation, a toy that helps in understanding the process of estimating temperature.  It was my mistake to use the word ‘optimize’ anywhere near it, as a lot of people thought they could use it to figure out what to put in the Bias and Filter tables.  You cannot.  It is there purely for education and entertainment.  It helped me a lot in understanding how the bias and filter work together to simulate smooth transitions in the temperature of the air in the intake.

What you can ‘optimize’ with it is the model itself.  While we have the tables containing data, we do not have the formulas governing them.  For example, in the Bias table the intervals are 10 g/sec (on the older F-body computers) or 4 g/sec (on the E40 and newer computers).  Do the values in the cells apply to the breakpoint itself, or to the center of the range around it (32 g/sec being the center of 30..34 g/sec on an E40 ECU)?  What happens when the value we need to look up is not in a cell directly?  Is it rounded to the nearest known value?  Is it interpolated?  What kind of interpolation: linear, spline, cubic, Bezier?  Until we start hacking the code on the ECU itself, we will not know the answers to these questions; however, a spreadsheet like this allows us to play with the different scenarios and see what works and what does not.  Of course this is not a proof of anything, but the changes should be visible enough to reveal some interesting correlations.
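
To make the lookup question more concrete, here is a small Python sketch of just one of the scenarios the spreadsheet lets you try: breakpoints every 4 g/sec (as on the E40), values clamped at the ends, and plain linear interpolation between cells.  The breakpoints and bias values are invented for illustration; none of this is confirmed ECU behavior.

    # Hypothetical bias lookup: 4 g/sec breakpoints, linear interpolation between cells.
    # The breakpoints and bias values below are illustrative only, not from a real calibration.
    BIAS_BREAKPOINTS = [0, 4, 8, 12, 16, 20, 24, 28, 32]                      # airflow, g/sec
    BIAS_VALUES      = [0.90, 0.72, 0.58, 0.47, 0.38, 0.31, 0.26, 0.22, 0.19]

    def bias_lookup(airflow_gps):
        """Linearly interpolate a bias value for a given airflow.

        This is only one of several plausible lookup schemes (nearest cell,
        spline, etc.); we do not know which one the ECU actually uses.
        """
        if airflow_gps <= BIAS_BREAKPOINTS[0]:           # clamp below the table
            return BIAS_VALUES[0]
        if airflow_gps >= BIAS_BREAKPOINTS[-1]:          # clamp above the table
            return BIAS_VALUES[-1]
        for i in range(len(BIAS_BREAKPOINTS) - 1):
            lo, hi = BIAS_BREAKPOINTS[i], BIAS_BREAKPOINTS[i + 1]
            if lo <= airflow_gps <= hi:
                frac = (airflow_gps - lo) / (hi - lo)
                return BIAS_VALUES[i] + frac * (BIAS_VALUES[i + 1] - BIAS_VALUES[i])

    print(bias_lookup(30))    # 30 g/sec falls halfway between the 28 and 32 cells

Swapping the interpolation for simple rounding to the nearest cell is a one-line change, which is exactly the kind of ‘what if’ the spreadsheet is for.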

Another important thing that came up during the creation of the spreadsheet is the final metric: how we evaluate the goodness of the model we conjectured.  The traditional metric in HPT or EFILive is expressed by (AFR-AFRwb)/AFR.  So when a metric of fit is needed for a range of values, most people employ the average of the AFR%Error or BEN.  The problem is that these errors can be both positive and negative, so they can cancel each other numerically.  For example, if you just used the traditional AFR%Error histogram and saw 0%, you would automatically jump to the conclusion that you have the perfect tune, while in fact you might simply have roughly the same amount of error on the positive and negative sides.  Considering that measurement error often follows a normal distribution, the errors on both sides are quite likely to push the average toward 0.  This is probably why it takes a few passes to get the fuel trims in the neighbourhood of where they should be to begin with.  This made me decide that a better metric is needed to assess the different models we come up with.

Putting an absolute value on the AFR%Error gives you only positive values, so every error ‘grows’ the average instead of driving it toward zero.  Also, since the number resulting from such a computation is an actual percent of difference, you can sum or even average the errors, getting a better idea of the nature of the errors in your system.  Such a metric would work just as intended with a great number of clean samples.  However, in the real world lots of data is hard to gather, and outliers occur fairly frequently, throwing off the averages.  Thus, someone had the idea of squaring the difference (AFR-AFRwb)² between the empirical data (AFRwb) and the intended target (AFR).  This way, the small errors contribute very little to the sum, while the larger errors carry significant weight.  This approach is great for spotting systems with a large number of outliers, as they make the sum grow quickly.  The problem is that too many outliers will weigh so much that the real data becomes almost completely ignored.  Yet another metric is the maximum deviation (max(|AFR-AFRwb|)).  The good aspect of it is that while the errors might not be small, they will all be within a certain range of the target.
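
To see how differently these metrics behave on the same data, here is a small Python sketch computing all four of them over a handful of made-up wideband samples against a 14.7 target:

    # Compare the four error metrics on the same (invented) wideband samples.
    target_afr = 14.7
    afr_wb = [14.2, 15.3, 14.6, 14.9, 13.8, 15.6, 14.7, 14.5]      # wideband readings

    errors   = [(target_afr - a) / target_afr * 100 for a in afr_wb]   # signed AFR%Error
    mean_err = sum(errors) / len(errors)                   # positives and negatives cancel
    mean_abs = sum(abs(e) for e in errors) / len(errors)   # every error 'grows' the average
    sse      = sum((target_afr - a) ** 2 for a in afr_wb)  # large errors dominate the sum
    max_dev  = max(abs(target_afr - a) for a in afr_wb)    # worst single excursion

    print(f"mean error   {mean_err:6.2f} %")    # can sit near zero even for a poor fit
    print(f"mean |error| {mean_abs:6.2f} %")
    print(f"sum sq error {sse:6.3f}")
    print(f"max dev      {max_dev:6.2f} AFR")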

No matter which of these metrics is used, the goal is the same:  to make our system predict/describe an observed function with more precision.  Certain metrics are going to skew results in different ways, but altogether they improve the model.  The important thing here is that there is no one ultimate metric; we have to know and understand a few different ones, and use the ones that make the most sense for what we’re trying to do.  In the case of tuning AFR, you don’t want to go too lean or too rich; you want to keep the discrepancies close to the target AFR.  For such a situation I like to use the maximum deviation metric, as it tries to ‘squeeze’ all the errors under the same level.  However, I found in practice that it is not a good metric to start with, as the computer had problems finding an optimal answer starting from a wild guess.  The sum of squared errors metric was much better for the initial pass, to get the parameters close to where we’d like them to be.  With those initial parameters, the maximum deviation metric became much more useful.
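
The same two-pass trick is easy to reproduce outside of Excel.  The sketch below uses SciPy instead of the Solver, with an invented two-parameter model standing in for the real thing: minimize the sum of squared errors from a wild guess first, then hand that answer to a second pass that minimizes the maximum deviation.

    # Two-pass fitting sketch: sum of squared errors first, then maximum deviation.
    # The 'model' and the data are invented stand-ins, not the actual bias/filter math.
    import numpy as np
    from scipy.optimize import minimize

    x = np.linspace(0, 100, 50)                     # pretend airflow samples
    y = 0.8 * np.exp(-0.05 * x) + 0.15              # pretend observed values

    def model(params, x):
        a, k = params
        return a * np.exp(-k * x) + 0.15

    def sse(params):
        return np.sum((model(params, x) - y) ** 2)

    def max_dev(params):
        return np.max(np.abs(model(params, x) - y))

    wild_guess = [2.0, 0.5]
    pass1 = minimize(sse, wild_guess, method="Nelder-Mead")      # coarse fit
    pass2 = minimize(max_dev, pass1.x, method="Nelder-Mead")     # polish the worst-case error
    print(pass1.x, pass2.x)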

So what is all this metric talk for, really?  The idea of optimization is to minimize errors.  However, you cannot decide which model is better until you can put a number on every model.  The interesting part is that in the end it does not matter what you are minimizing, the maximum deviation or the sum of squared errors: as long as it is smaller than before, progress was made.

That is how I set up the spreadsheet.  There are multiple models, with multiple metrics assessing the fit.  The parameters in all of them are the same, but strangely enough, different models like different parameters, which only further stresses the importance of figuring out the full model, not just the parameter values.  The Solver in Excel is quite good, albeit tricky sometimes, at arriving at the best set of parameters.  It is, however, very intriguing to watch how the parameters change (or not!) depending on which metric is being minimized.  The best part is to watch the charts change: MATbias, MATfilter, MATscanned, alongside ECT, IAT and Airflow, all dancing around, and the more you optimize, the closer MATfilter gets to MATscanned.  Another interesting thing is to see the more drastic changes in airflow (acceleration, deceleration) rapidly swing the estimated temperature one way or another.  Depending on which metric gets optimized, certain patterns on the graph are followed more closely, while others get ignored altogether.  I highly recommend spending some quality time playing with the different models and metrics, to get a real feel for what we’re dealing with here.
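
For those who would rather read code than Excel formulas, this is roughly the recurrence the spreadsheet plays with, as I understand it: the bias proportions IAT toward ECT, and the filter drags the previous estimate toward that biased value a little more each sample.  The lookup functions are placeholders for whatever the ECU really does.

    # Rough sketch of the bias + filter recurrence the spreadsheet simulates.
    # bias_lookup() and filter_lookup() stand in for the real table lookups.
    def simulate_mat(iat, ect, airflow, bias_lookup, filter_lookup):
        """Return an estimated MAT trace for logged IAT/ECT/airflow samples."""
        mat = []
        prev = iat[0]                                # seed the filter with the first IAT sample
        for i_t, e_t, af in zip(iat, ect, airflow):
            bias = bias_lookup(af)
            filt = filter_lookup(af)                 # 0..1; more airflow -> faster convergence
            mat_bias = i_t + (e_t - i_t) * bias      # instantaneous biased temperature
            prev = prev + (mat_bias - prev) * filt   # first-order lag toward the biased value
            mat.append(prev)
        return mat

Feed the result and MATscanned into any of the metrics above and you have essentially the loop the Solver is optimizing.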

We cannot forget what this spreadsheet is, however.  It is an estimator of an estimator.  We do not, and will not, know the real temperature inside the intake until we use the old-fashioned empirical method of sticking an extra probe into the bowels of the intake somewhere.  At best, if all the math and modeling we developed is exactly the same as what is in the ECU, MATscanned and my MATcalculated would be the same.  Making them the same does not optimize the model to report the proper temperature; it only allows me to play with the components in such a way that I can predict the result with a greater degree of confidence.  IAT, ECT, and the Bias and Filter tables are all parts of an estimator; that is all they do.


The purpose of all this mathematical gymnastics is to learn how the computer estimates the temperature in the intake.  With understanding comes control.  Control allows experiments and simulations to take place and either prove a theory or debunk a myth.  If our model is consistently close enough to MATscanned, though, it would be a good time to try the optimized parameters in the tune and see if the car behaves the way the numbers predict.

The real complication is that it looks like we will not get VE tuning right without arriving at the correct Bias tables, and we will also not get Bias right without a solid VE table.  I am trying to do both together, hoping that setting them to correct and realistic values will yield the least error across the full simulation, but it is neither obvious to understand nor easy to implement.

 

A few other points came across very strongly in all the discussions.  Proper attribution is key: other factors (e.g. timing, transitions) can have a significant influence on AFR, and the changes due to temperature can get lost in the noise from those other factors (hpt thread 14996).

These temperature accounting discussions brought out a few new people.  For example, one dude named Adrian had this simple but very profound idea: if MAT=IAT+(ECT-IAT)*BIAS, then MAT-IAT=(ECT-IAT)*BIAS, and therefore BIAS=(MAT-IAT)/(ECT-IAT), so with a MAT probe we could determine BIAS directly.  If MAT=IAT, then BIAS=0, which is nice, because not only does it satisfy the formulas, it also makes common sense.  With an extra temperature probe in the intake (GM PART # 25036751) we could arrive at BIAS in a very simple way.  Any takers?
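
If anyone does wire in that probe, the arithmetic on the log is trivial.  A sketch, with invented sample values; the only thing to watch for is a near-zero ECT-IAT spread, which makes the division blow up:

    # Back out BIAS = (MAT - IAT) / (ECT - IAT) from a log that includes a real MAT probe.
    # Sample tuples are (airflow g/s, IAT, ECT, measured MAT); the numbers are invented.
    samples = [
        (10, 30.0, 90.0, 66.0),
        (30, 30.0, 90.0, 48.0),
        (60, 30.0, 90.0, 39.0),
    ]

    for airflow, iat, ect, mat in samples:
        if abs(ect - iat) < 1.0:      # skip samples where the coolant is barely warmer than the air
            continue
        bias = (mat - iat) / (ect - iat)
        print(f"{airflow:5.0f} g/s -> bias {bias:.2f}")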

Another interesting point that came up in the discussions was the reason for the whole temperature estimation model, instead of a simple probe mounted inside the intake.  Would it be reasonable to think that the whole complex estimator was created because MAT probes were measuring the temperature of the intake housing, and not the air inside it?

Another guy, Phillip, did some experiments and brought to light that temperature does not affect MAF readings, while SD is clearly affected by temperature.  This means that with a proper MAF calibration, a VE calibration can be obtained by mapping out the MAF airflow on the traditional RPM/MAP axes and back-calculating VE based on the temperature estimates.  This is interesting, as one of my first contributions to the tuning community was the idea that MAF tuning is easy, because all you have to do is map out the airflow resulting from the VE-based SD calculations onto the MAF scale, which is the exact reverse of what this newest batch of research suggests.
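
A sketch of what that back-calculation might look like, using the textbook speed-density relation (ideal gas law plus 4-stroke pumping).  The exact form and units GM uses internally are not known to me, so treat this purely as an illustration of the idea:

    # Back-calculate VE from a trusted MAF reading using a textbook speed-density relation.
    # This illustrates the idea only; the ECU's internal form and units may well differ.
    R_AIR = 287.05      # J/(kg*K), specific gas constant for dry air

    def ve_from_maf(maf_gps, rpm, map_kpa, mat_c, displacement_l):
        """Volumetric efficiency implied by a measured mass airflow."""
        density = (map_kpa * 1000.0) / (R_AIR * (mat_c + 273.15))    # kg/m^3, ideal gas law
        vol_flow = displacement_l / 1000.0 * rpm / 120.0             # m^3/s swept by a 4-stroke
        ideal_mass_gps = density * vol_flow * 1000.0                 # g/s at 100% VE
        return maf_gps / ideal_mass_gps

    # Example: 5.7 L engine, 4000 RPM, 90 kPa, 40 C estimated MAT, 120 g/s on the MAF
    print(ve_from_maf(120.0, 4000, 90.0, 40.0, 5.7))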

A thing that bugs me about the various bias tables is how they differ from one model year to another.  There are straight lines, two sloped straight lines, lines that look like exponential decay… ultimately, they all point at the same pattern:  at small airflow numbers the bias line starts toward ECT, and quickly works toward IAT, asymptotically leveling off at the end.  To make estimating the bias curve possible, we must have some sort of idea for the function.  An exponential decay curve makes for such a line, as it has all the properties we need (BIAS = a*e^(-k*MAF) + b).  Another great thing about the exponential decay curve is that we can manipulate its shape using three parameters, making the fitting process about as complicated as a polynomial or even a straight line, yet yielding a fairly complex curve shape.
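
Fitting that curve to whatever bias points we can pull out of a table (or out of probe data) takes one call in most tools.  A sketch with SciPy's curve_fit; the data points below are made up:

    # Fit BIAS = a * exp(-k * MAF) + b to a handful of (airflow, bias) points.
    # The points are invented; substitute values read out of an actual bias table.
    import numpy as np
    from scipy.optimize import curve_fit

    def bias_curve(maf, a, k, b):
        return a * np.exp(-k * maf) + b

    maf  = np.array([0, 8, 16, 24, 32, 48, 64], dtype=float)
    bias = np.array([0.90, 0.62, 0.45, 0.34, 0.27, 0.21, 0.18])

    (a, k, b), _ = curve_fit(bias_curve, maf, bias, p0=[1.0, 0.05, 0.1])
    print(f"a={a:.3f}  k={k:.4f}  b={b:.3f}")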

 

Tunes often have tables that are either unused or used in irrelevant ways.  For example, there is a voltage multiplier for IFR, which is just set to 1.0 on most cars.  However, on the 4-cylinder cars you can use it to achieve an effective IFR higher than its natural limit of 64 lbs/hr.  On all the other platforms I tried, changing these values has no effect.  Such ‘occasionally functional’ tables make me wonder what the purpose is of adding a speed dimension to the Bias table in the E40 ECU on the 05-06 GTO.  The Bias values do not change along the speed axis, making it effectively useless.  Is the full table used, and the calibration just does not take speed into account, or is the computer still using only one row of values, with the rest of the table unused?  It is not that complicated to account for it, but I do not want to introduce variables to the model that are simply not there.

 

Most discussions about system modeling end up with someone attempting to oversimplify a complex system.  Reducing GM’s MAF/SD hybrid to pure MAF or pure SD in order to tune a particular table is an example of simplification in the name of gaining control.  On the other hand, we have people setting the Bias table to one value across the entire table simply because they do not understand how the system works.  The combination of the Bias and Filter tables is a great example of how an ‘overcomplicated’ system can easily (and properly!) be reduced to a simpler, older version of the same system.  You can think of it as an evolutionary process, where evolution was forced by the insufficient precision of the older, simpler systems.

In the beginning there was a single IAT probe used to estimate the air charge’s temperature.  It probably did not work too well, so they decided to move this probe into the intake, hoping to be ‘closer to the action.’  This also turned out to be a fiasco, as the probe ended up measuring the temperature of the intake more so than the air inside it.  The decision was made that since you apparently cannot measure the temperature in the intake, it would be better to estimate it, as it is going to be between the two sources of extreme temperatures in the engine bay: the airbox and the coolant.  A first attempt at this estimator was made using the Bias table alone, a simple matter of proportioning the two temperatures based on the airflow involved.  If you simulate it, the sequence of temperatures resulting from such proportioning yields a rather choppy and abrupt final temperature function.  Since most things in nature have smooth transitions, such a model demonstrated the need for a smoother output.  That output was provided by introducing the Filter table into the model: the more airflow is involved, the quicker the temperature converges to the newly present conditions.  If that was not complicated enough, the Bias table has grown a dimension to accommodate changes in bias depending on the speed of the vehicle.  I would guess this is due to the fact that at higher speeds the incoming air not only gains density, but the air charge also has a lower temperature.  So the system has evolved from a single sensor with no corrections on it to a monster with three sensors (IAT, ECT, SPEED) and two tables (Bias is 2D, Filter is 1D), proportioning and dampening the inputs to create an estimator of the temperature in the intake.

Let’s say we have a car with the fully evolved implementation of this estimator, but we have no clue how to calibrate all the necessary tables.  However, we might know how to deal with a simpler system, say the one without the Filter table and with a 1-dimensional Bias table.  If we fully understand the system (even if we cannot control it) we can set the values of the Filter table to 1 (instant change), practically eliminating it from the system.  Then, if the Bias table’s speed-dependent values are set to be identical for each airflow, the Bias table becomes its own simpler version, fully emulating the behavior of an older system; hopefully it is close enough to yield good results, and simple enough that it is solvable.
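
The ‘de-evolution’ described above can be written down directly.  A small sketch, using the same bias + filter step as before: with the filter forced to 1 the previous estimate drops out and you are left with the plain bias proportion, and with every speed row made identical the 2D bias table behaves exactly like its 1D ancestor.

    # Collapse the full estimator back to its simpler ancestors.
    def mat_estimate(prev_mat, iat, ect, bias, filt):
        """One step of the bias + filter estimator."""
        mat_bias = iat + (ect - iat) * bias
        return prev_mat + (mat_bias - prev_mat) * filt

    # With filt = 1.0 the previous estimate cancels out entirely:
    #   prev + (mat_bias - prev) * 1.0 == mat_bias
    # i.e. the Filter table is effectively removed from the system.
    print(mat_estimate(55.0, 30.0, 90.0, bias=0.4, filt=1.0))   # 30 + (90 - 30) * 0.4 == 54.0

    # Likewise, a 2D (speed x airflow) Bias table with identical rows
    # behaves exactly like the older 1D airflow-only table.
    bias_row = [0.90, 0.62, 0.45, 0.34, 0.27]
    bias_2d  = [bias_row[:] for _ in range(6)]      # six speed rows, all the same
    assert all(row == bias_row for row in bias_2d)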

Here is the best example of how not to simplify complex systems.  This is taken from some older bias adjusting discussions on the HPTuners forum:

“well..my cylinder bias temp is 1.0 across the board for now..and because they are all the same value my filter settings basically dont matter you would think this is a bad idea..but sometimes de-engineering GM and simplifying is a lot better than trying to use thoer whole dam slew of BS they throw in there..LOL”