I implemented a block as a Level-2 S-function (written in an M-file) that does the same job as the built-in State-Space block. When I compared the outputs, the two blocks produced the same results. However, when I compared the elapsed time, I noticed that my S-function runs slower than the Simulink built-in block, particularly when the system is very large (4000 states). The extra time is spent before Simulink takes the first integration step. After the first step, my block runs at a speed comparable to the built-in State-Space block. I realize this may be caused by inefficiency in my implementation, but could there be any other reason for it?
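
For reference, here is the model that both blocks are meant to compute. I am assuming the usual continuous-time state-space form. The symbols $A$, $B$, $C$, $D$, $x$, $u$, $y$ and $n$ are standard notation, not names taken from my S-function, and $n = 4000$ is the large case mentioned above:

```latex
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t), \qquad x(t) \in \mathbb{R}^{n},\; A \in \mathbb{R}^{n \times n} \\
y(t)       &= C\,x(t) + D\,u(t)
\end{aligned}
```

With $n = 4000$, a dense $A$ has $1.6 \times 10^{7}$ entries, which is about 128 MB in double precision, so any work that touches the whole matrix before the first step would be noticeable.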