The inclusion fee is part of a transaction's total fee. It is meant to cover all costs of handling a transaction except for execution, which is charged separately as the execution fee. The inclusion fee covers resource-utilization overhead (network transmission, memory usage, etc.) incurred once a transaction is submitted, and it must be calculable without executing the transaction.
The goal of this proposal is to come up with a formula that incentivizes developers to create simpler transactions, which will cost less than more complex ones.

In the original PR, we targeted the average-sized transaction (1800 bytes) to be charged the original 1e-6 FLOW. There is now an ongoing discussion about whether we should instead target the median-sized (~1200 bytes) transaction for that same amount as the new inclusion fee.

For more context, here is the distribution of our mainnet transactions.

I think we should go with the median transaction size to establish the baseline fee, because the distribution is right-skewed. The median gives a more reasonable baseline, since closer to 50% of transactions will fall below the baseline fee.

Given the current data, what % of transactions is <= 1200 bytes, and what % is <= 1800 bytes?

Another thing to think about: we could choose "round" (base-2) numbers here to help with reasoning, if it's close. E.g., size the target so that a 1 KiB transaction maps to the current inclusion fee.
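As a rough illustration (a sketch only, using the calibration coefficients quoted later in this thread), targeting a "round" 1 KiB transaction would give:

```python
# Sketch: what the constant C would be if we calibrated the inclusion fee
# to a "round" 1 KiB (1024-byte) target transaction instead of 1800 B.
# Coefficients are taken from the fee equation quoted later in this thread.
A = 6.370641e-7   # per-byte coefficient
B = 2.585227e-3   # constant overhead term

c_1kib = 1.0 / (A * 1024 + B)
print(c_1kib)  # roughly 308.9
```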

There’s also a potential argument that inclusion fees should on average be higher as time goes on because more nodes means there is more gossip overhead.

I don’t have much information about this FLIP, but just wanted to say that the average is correct here.

Using the median here would be PR talk (e.g., "more than 50% of transactions will cost less").

Total Cost = AverageTxSize * NumberOfTransactions

Since NumberOfTransactions is irrelevant here, what matters is the correlation between AverageTxSize and Total Cost.

Imagine this: if I were spamming the network with huge numbers of 0-100 byte transactions, the median would be one of those, and by using the median we would make normal transactions very expensive.
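To illustrate the point (a toy example with made-up sizes, not real mainnet data): flooding the sample with tiny transactions drags the median all the way down to the spam size.

```python
import statistics

# Toy sample: "normal" traffic clustered around 1200-2000 bytes,
# plus a flood of tiny 50-byte spam transactions that outnumbers it.
normal = [1200, 1300, 1500, 1800, 2000] * 100   # 500 normal txs
spam = [50] * 1000                              # 1000 spam txs

sizes = normal + spam
print(statistics.median(sizes))  # 50: the median is now a spam tx
print(statistics.mean(sizes))    # the mean drops too, but far less drastically
```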

A little more detail on how the final equation was derived, specifically for the C in

1.0 = C * (6.370641e-7 * TxByteSize + 2.585227e-3)

With the average size (1800 B), C is 267.956977406 (as in the original FLIP);
With the median size (~1200 B; I need to run another BigQuery to get the exact number, but it shouldn’t be far from 1200), C is 298.533847732.
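The solve for C can be reproduced numerically; a quick sketch:

```python
# Solve 1.0 = C * (A * TxByteSize + B) for C at the two candidate
# target sizes (coefficients from the equation above).
A = 6.370641e-7
B = 2.585227e-3

def calibrate_c(target_size_bytes: float) -> float:
    return 1.0 / (A * target_size_bytes + B)

c_avg = calibrate_c(1800)     # ~267.957 (average-sized target)
c_median = calibrate_c(1200)  # ~298.534 (median-sized target)
print(c_avg, c_median, c_median / c_avg - 1)  # the median target is ~11.4% higher
```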

So using the median, the inclusion fee will generate about 11.41% more than using the average - is that still a concern, considering potential small-size tx spamming?

I think @bluesign’s point here is actually that the distribution is manipulable: no matter which statistical function we choose, someone could attempt to spam the network to change the distribution used in our sample.

So regardless, the analysis needs to correct for any bad behavior in the sample with some assumptions. My statement here re: median vs. average is that IF the distribution is truly the real-world distribution over time, then the median is the more correct function to choose.

As far as I can see, looking at various months and samples of data, the distribution of txn sizes does indeed match what you have here - but of course that’s just right now and doesn’t speak to changes in user behavior.

If the goal of the FLIP is for the meat of the distribution to be charged the same as today, then this feels appropriate.

@pgpg I also believe the average is a fine metric for a dynamic inclusion fee, given the assumption that over time the tx-size distribution will progress towards a normal distribution ("progress towards" is the key phrase here - it need never be exactly normal!). This should promote a market-based approach where inclusion fees are ultimately dictated by the node operators and users of the network; if we examine a 95% confidence interval over that distribution, we know those folks will be largely unaffected at current TPS rates.

One concern is manipulation of the distribution to alter the average inclusion fee: someone could execute a ton of small transactions to shift the average behind the dynamic inclusion fee (skewing the distribution) so that one particular large transaction becomes cheaper. This is highly unlikely, given the volume of small transactions required and the exorbitant value that one large transaction would need to carry - but if a bimodal distribution emerges in the future, it may become advantageous to do so.

In this FLIP all parameters of the equation are static, so it only reflects the distribution of our transactions at the time the FLIP was created. A distribution-manipulation attack will become a practical concern once we have an adaptive dynamic inclusion fee in the future, where parameters are automatically adjusted from the real-time txn-size distribution.
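To make that concrete, a hypothetical adaptive scheme (purely a sketch, not part of this FLIP) might periodically recalibrate C from a window of recently observed transaction sizes:

```python
import statistics

A = 6.370641e-7   # per-byte coefficient from the FLIP's equation
B = 2.585227e-3   # constant overhead term

def recalibrate(observed_sizes, statistic=statistics.mean):
    """Hypothetical: solve 1.0 = C * (A * t + B), where t is a chosen
    statistic (mean or median) of the recent tx-size sample."""
    t = statistic(observed_sizes)
    return 1.0 / (A * t + B)

recent = [900, 1200, 1500, 1800, 2400]  # made-up sample window
print(recalibrate(recent))                      # mean-based C
print(recalibrate(recent, statistics.median))   # median-based C
```

Note that whichever statistic is chosen, this is exactly where the spam concern above bites: the recalibration input is attacker-influenced.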