[mcstas-users] McStas simulation of nested elliptical optic with many levels crashes

Erik B Knudsen erkn at fysik.dtu.dk
Fri Aug 20 10:43:06 CEST 2021


Dear Richard,
Off the top of my head I can't think of a straightforward way of doing 
that, other than editing interoff-lib.c and putting in some reporting 
mechanism.

But, when I think about it again, a sensible way to handle this occurs 
to me:
The buffer overrun check that _is_ in the code checks whether the list 
of intersections exceeds 1024 (actually against the macro 
CHAR_BUF_LENGTH which has the value 1024). If so it will report that, 
and continue with the set that has so far been found.

Therefore, if you set OFF_INTERSECT_MAX to 1024 (or bigger for that 
matter) you will be alerted if the intersection list exceeds 1024, and 
if that happens often you might not be able to fully trust the results.
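
As a minimal sketch of that edit, in Python since that is what you already 
use for the OFF files: the helper below rewrites the 
"#define OFF_INTERSECT_MAX" line in the text of the generated c-file or of 
interoff-lib.h. The function name set_off_intersect_max and the demo 
string (including SOME_OTHER_MACRO) are made up for the illustration; only 
the macro name OFF_INTERSECT_MAX comes from the library.

```python
import re

# Matches a line of the form "#define OFF_INTERSECT_MAX <integer>",
# allowing optional spaces or tabs around '#' and between the tokens.
_DEFINE_RE = re.compile(
    r"^([ \t]*#[ \t]*define[ \t]+OFF_INTERSECT_MAX[ \t]+)\d+",
    re.MULTILINE,
)


def set_off_intersect_max(source_text, new_value=1024):
    """Return (patched_text, n_replaced) with the macro set to new_value."""
    return _DEFINE_RE.subn(lambda m: m.group(1) + str(new_value), source_text)


# Demo on an in-memory snippet (SOME_OTHER_MACRO is a hypothetical name).
# For a real file: read its text, patch it, and write it back only if
# exactly one line was replaced.
demo = "#define SOME_OTHER_MACRO 1\n#define OFF_INTERSECT_MAX 100\n"
patched, n_replaced = set_off_intersect_max(demo)
print(patched)
print("lines replaced:", n_replaced)
```

If the generated c-file is patched, the edit presumably has to be repeated 
whenever the instrument is regenerated, so patching the header in the 
McStas tree may be the more convenient of the two.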

It is unlikely that the intersection list length will have a significant 
impact on the memory footprint of the simulations, so it should be safe 
to increase the number to 1024 or beyond.
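
A rough back-of-the-envelope estimate in Python illustrates the point. The 
72 bytes per intersection record is an assumed figure (roughly one double 
for the time, two 3-vectors of doubles and a few integers), not a value 
read from interoff-lib, and the function name is made up for the sketch.

```python
# Assumed size of one intersection record in bytes. This is an illustrative
# guess, not a value taken from the interoff library.
ASSUMED_BYTES_PER_RECORD = 72


def intersection_list_kib(max_intersections,
                          bytes_per_record=ASSUMED_BYTES_PER_RECORD):
    """Size in KiB of a list holding max_intersections records."""
    return max_intersections * bytes_per_record / 1024


for limit in (100, 1024, 10000):
    size = intersection_list_kib(limit)
    print(f"OFF_INTERSECT_MAX={limit}: ~{size:.1f} KiB")
```

Under that assumption, even a limit of 10000 keeps the list below one 
megabyte.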

I should perhaps also stress that the overrun check in the development 
tree now properly checks against OFF_INTERSECT_MAX as it should.
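
To illustrate the intended behaviour of such a check, here is a toy model 
in Python. It is not the interoff-lib code: the function name 
collect_intersections and the representation of intersections as plain 
numbers are invented for the illustration. It keeps at most "limit" 
entries, counts how many were dropped, reports that, and carries on with 
the set found so far.

```python
def collect_intersections(candidates, limit=1024):
    """Keep at most `limit` intersections; return (kept, n_dropped)."""
    kept = []
    n_dropped = 0
    for item in candidates:
        if len(kept) >= limit:
            # List is full: never write past the end, just count the overflow.
            n_dropped += 1
            continue
        kept.append(item)
    if n_dropped:
        print(f"warning: intersection list exceeded {limit}; "
              f"dropped {n_dropped}")
    return kept, n_dropped


# Toy usage: 1500 candidate intersections against a limit of 1024.
kept, n_dropped = collect_intersections(range(1500), limit=1024)
print(len(kept), n_dropped)
```

A counter like n_dropped is also the kind of reporting mechanism one would 
have to add by hand to see how close a simulation comes to the limit.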

I hope that made sense to you.
cheers
Erik


On 19/08/2021 16:30, Richard Wagner wrote:
> Dear Erik,
> 
> we tested the workaround you proposed and can confirm that it works so far.
> 
> Simulations that failed previously, now run smoothly with no crashes so far.
> Many thanks for the help!
> 
> Is there a way to check the number of intersections reached during a 
> simulation?
> It would tell us whether we are at the limit or still in calm 
> waters.
> 
> Bye
> 
> Richard
> 
> On 18/08/2021 13:54, Erik B Knudsen wrote:
>> Dear Richard
>> (writing to the mailing list as well as it may interest others)
>> I have now understood what happens: Your issue is caused by a buffer 
>> overrun. If the number of intersections computed by off_intersect 
>> exceeds 100, this results in undefined behaviour.
>> The number 100 is hardcoded (via a #define) within the interoff library.
>>
>> In your particular case, it appears that when a neutron is on a 
>> trajectory towards a corner of your square channels it may reflect 
>> many times, each time with a higher angle, and eventually be on a 
>> trajectory which intersects many polygons, thus triggering the buffer 
>> overrun. Also this error was not properly caught in the code.
>>
>> I'll increase the number of possible intersections in the development 
>> tree, and fix the check. That way it'll get better in the next release.
>>
>> In the meantime, as a workaround, you may edit the generated c-file
>> or the interoff-lib.h file found in your McStas-tree.
>> Change the line
>>  #define OFF_INTERSECT_MAX 100
>> to something bigger such as
>>  #define OFF_INTERSECT_MAX 1000
>>
>> Any bigger than 1024 is not really meaningful because that is what the 
>> overrun-check checks for :-).
>>
>> This makes the code run and it appears to work also with your original 
>> file, i.e. the smaller ones. So that's good I suppose.
>>
>> cheers
>> Erik
>>
>>
>> On 12/07/2021 13:39, Richard Wagner wrote:
>>> Dear Erik,
>>>
>>> the random seed has an influence on the point (in time) when the 
>>> simulation crashes.
>>> That means for certain seeds a simulation run will survive and crash 
>>> only after repeating the run with another seed,
>>> or, depending on our simulation, we have a crash at a run where we added 
>>> more nested optical elements.
>>>
>>> If the seed is held fixed and an error occurs, it is repeated each run 
>>> and becomes reproducible.
>>>
>>> Hope that info helps to shed light on the issue.
>>>
>>> Best,
>>>
>>> Richard
>>>
>>>
>>> On 07/07/2021 14:56, Erik B Knudsen wrote:
>>>> Dear Richard,
>>>> That is unfortunate. We will investigate the matter on our end.
>>>> From what you describe - it sounds like there is a memory leak 
>>>> somewhere.
>>>> Is the "point of crash" always the same across simulation runs or 
>>>> does it vary (as I would expect)? I.e. does it depend on the random 
>>>> seed?
>>>>
>>>> Also thank you for the very thorough report btw. If only they were 
>>>> all like this.
>>>>
>>>> cheers
>>>> Erik
>>>>
>>>> On 07/07/2021 14:08, Richard Wagner wrote:
>>>>> Dear all,
>>>>>
>>>>> here is a follow up.
>>>>>
>>>>> In short, crashes still occur.
>>>>>
>>>>> Of our many tests I will point out two scenarios:
>>>>>
>>>>> a) double planar optic with corrected OFF file. Now the surfaces 
>>>>> don't intersect each other anymore; they only share common sides. 
>>>>> With this approach the number of points and faces explodes as one 
>>>>> goes to a higher number of levels (vertices scale with the square 
>>>>> of the levels).
>>>>>
>>>>> The simulation starts to take noticeably longer (several minutes 
>>>>> compared to ~20 sec), and we get a crash after increasing to 90 
>>>>> levels with the message
>>>>>
>>>>> # McStas 2.7 - Nov. 27, 2020: [pid 4509] Signal 11 detected SIGSEGV 
>>>>> (Mem Error)
>>>>>
>>>>> # Simulation: NNb (NNb_90.instr)
>>>>>
>>>>> # Breakpoint: end_event (Trace) 0.68 % ( 6771.0/ 1000000.0)
>>>>>
>>>>> # Date: Wed Jul 7 14:03:06 2021
>>>>>
>>>>> # Started: Wed Jul 7 14:03:00 2021
>>>>>
>>>>> # Last I/O Error: No such file or directory
>>>>>
>>>>> b) a Single mono Planar mirror
>>>>>
>>>>> The simulation crashes with either of the following messages when 
>>>>> the number of nested levels is above 115.
>>>>>
>>>>> # McStas 2.7 - Nov. 27, 2020: [pid 98724] Signal 11 detected 
>>>>> SIGSEGV (Mem Error)
>>>>>
>>>>> # Simulation: NNb (NNb_115.instr)
>>>>>
>>>>> # Breakpoint: nested (Trace) 0.18 % ( 1821.0/ 1000000.0)
>>>>>
>>>>> # Date: Wed Jul 7 12:57:49 2021
>>>>>
>>>>> # Started: Wed Jul 7 12:57:49 2021
>>>>>
>>>>> # Last I/O Error: No such file or directory
>>>>>
>>>>> # McStas 2.7 - Nov. 27, 2020: [pid 1196] Signal 11 detected SIGSEGV 
>>>>> (Mem Error)
>>>>>
>>>>> # Simulation: NNb (NNb.instr)
>>>>>
>>>>> # Breakpoint: end_event (Trace) 2.17 % ( 21733.0/ 1000000.0)
>>>>>
>>>>> Any idea which memory runs out?
>>>>>
>>>>> We made the observation that when running the same simulation with 
>>>>> McStasScript we run into the crash earlier (level 75 instead of the 
>>>>> 115)
>>>>>
>>>>> I put the OFF files of the two cases in the attachment. The 
>>>>> instrument file is the same during all tests (meaning only the OFF 
>>>>> file of the Guide_anyshape is changed).
>>>>>
>>>>> Hoping for any ideas on how to proceed further.
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Best,
>>>>>
>>>>> Richard
>>>>>
>>>>> On 05/07/2021 11:55, Erik B Knudsen wrote:
>>>>>> Dear Richard,
>>>>>> Very good to hear. For sure, overlapping polygons are expected to 
>>>>>> create problems and undefined behaviour. The underlying routine 
>>>>>> that checks polygon intersections does not check for consistency, 
>>>>>> and instead only checks which polygon it hits first, which could 
>>>>>> be the wrong one.
>>>>>> That the problem only arises at a certain level of complexity is 
>>>>>> perhaps connected to the sheer number of overlaps becoming large - 
>>>>>> but that is merely guesswork on my part.
>>>>>>
>>>>>> Looking forward to hearing the further output of your efforts.
>>>>>>
>>>>>> cheers
>>>>>> Erik
>>>>>>
>>>>>> On 02/07/2021 13:30, Richard Wagner wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> in the meantime we had a detailed look at the matter from our 
>>>>>>> side and now suspect a problem with the OFF-file.
>>>>>>>
>>>>>>> If we use a mono planar version of the optic (see example in the 
>>>>>>> attachment) we can increase the nested levels ad lib.
>>>>>>> We are only limited by the fact that the spacing between the 
>>>>>>> levels is becoming so small, that it makes no sense to continue.
>>>>>>>
>>>>>>> A major difference between the single and double planar optic is 
>>>>>>> that the double planar OFF-File has surfaces that intersect each 
>>>>>>> other.
>>>>>>>
>>>>>>> I don't know if this is "allowed" or if it can cause McStas to 
>>>>>>> have problems during the simulation.
>>>>>>> Up to a certain level of complexity it worked fine.
>>>>>>>
>>>>>>> What we are currently doing is changing the function (written in 
>>>>>>> Python) that generates the double planar OFF-file so that it 
>>>>>>> generates additional vertices and faces at the intersections of 
>>>>>>> the surfaces. So we end up having a cleanly defined optic with no 
>>>>>>> 'undefined intersections'.
>>>>>>>
>>>>>>> I will report the outcome of these changes.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Richard
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 02/07/2021 09:35, Erik B Knudsen wrote:
>>>>>>>> Dear Richard,
>>>>>>>> Thank you for the thorough and detailed report. We will take a 
>>>>>>>> look at your problem asap and report back.
>>>>>>>> cheers
>>>>>>>> Erik
>>>>>>>>
>>>>>>>> On 29/06/2021 18:31, Richard Wagner wrote:
>>>>>>>>> Dear McStas experts,
>>>>>>>>>
>>>>>>>>> We are currently doing simulations with nested elliptical 
>>>>>>>>> optics and so far things ran quite smoothly.
>>>>>>>>>
>>>>>>>>> We generate the OFF file for the optic ourselves and use the 
>>>>>>>>> Guide_anyshape component.
>>>>>>>>>
>>>>>>>>> We start with an outer layer and continue to add inner layers 
>>>>>>>>> one at a time.
>>>>>>>>> If we then get to optical components that have a high number of 
>>>>>>>>> levels, we run into the problem that McStas crashes or 
>>>>>>>>> aborts the simulation.
>>>>>>>>>
>>>>>>>>> Output in that case reads:
>>>>>>>>>
>>>>>>>>> # McStas 2.7 - Nov. 27, 2020: [pid 64818] Signal 11 detected 
>>>>>>>>> SIGSEGV (Mem Error)
>>>>>>>>> # Simulation: NNb (NNb.instr)
>>>>>>>>> # Breakpoint: psd_monitor (Trace) 2.46 % ( 24574.0/ 1000000.0)
>>>>>>>>> # Date:      Tue Jun 29 17:44:13 2021
>>>>>>>>> # Started:   Tue Jun 29 17:44:13 2021
>>>>>>>>> # Last I/O Error: No such file or directory
>>>>>>>>> # McStas 2.7 - Nov. 27, 2020: Simulation stop (abort).
>>>>>>>>>
>>>>>>>>> Or
>>>>>>>>>
>>>>>>>>> # McStas 2.7 - Nov. 27, 2020: [pid 66573] Signal 10 detected 
>>>>>>>>> [proc 0] SIGBUS (Bus error)
>>>>>>>>> # Simulation: NNb (NNb.instr)
>>>>>>>>> # Breakpoint: nested (Trace) 85.56 % (  855555.0/ 1000000.0)
>>>>>>>>> # Date:      Tue Jun 29 18:00:32 2021
>>>>>>>>> # Started:   Tue Jun 29 18:00:28 2021
>>>>>>>>> # Last I/O Error: No such file or directory
>>>>>>>>>
>>>>>>>>> There are many messages such as the following in the McStas 
>>>>>>>>> window, too:
>>>>>>>>>
>>>>>>>>> Guide_anyshape: nested: Warning: Reflectivity R=7.02318 > 1 
>>>>>>>>> lowered to R=1.
>>>>>>>>> Guide_anyshape: nested: Warning: Reflectivity R=7.02365 > 1 
>>>>>>>>> lowered to R=1.
>>>>>>>>> Guide_anyshape: nested: Warning: Reflectivity R=7.02412 > 1 
>>>>>>>>> lowered to R=1.
>>>>>>>>> Guide_anyshape: nested: Warning: Reflectivity R=7.0246 > 1 
>>>>>>>>> lowered to R=1.
>>>>>>>>>
>>>>>>>>> I put an example of an instrument file (+OFF , +Source 
>>>>>>>>> Component) of a failed run for a 1m optic in the attachment.
>>>>>>>>>
>>>>>>>>> The trace run for instrument visualization works.
>>>>>>>>> We only run into the problem for short optics <=2 m, where a 
>>>>>>>>> high number of nested levels is needed to completely cover the 
>>>>>>>>> cross section.
>>>>>>>>>
>>>>>>>>> We run into the problem on Ubuntu 18.04 and macOS Big Sur 
>>>>>>>>> machines.
>>>>>>>>>
>>>>>>>>> Any ideas what the problem is? Is the spacing of the elliptical 
>>>>>>>>> levels getting too narrow, perhaps?
>>>>>>>>>
>>>>>>>>> Thanks in advance,
>>>>>>>>>
>>>>>>>>> Richard
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>>> *Richard Wagner*
>>>>>>>>> Research Engineer
>>>>>>>>> Nuclear and Particle Physics Group
>>>>>>>>> Institut Laue-Langevin - ILL
>>>>>>>>> 71, avenue des Martyrs
>>>>>>>>> CS 20156
>>>>>>>>> 38042 Grenoble Cedex 9
>>>>>>>>> France
>>>>>>>>>
>>>>>>>>> www.ill.eu <www.ill.eu>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> mcstas-users mailing list
>>>>>>>>> mcstas-users at mcstas.org
>>>>>>>>> https://mailman2.mcstas.org/mailman/listinfo/mcstas-users
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>

-- 
Erik Bergbäck Knudsen, Research Engineer         | DTU | morituri
NEXMAP, DTU Fysik, DK-2800 Kgs. Lyngby, Denmark  |<>-<>|    te
phone: (+45) 2132 6655                           |<>-<>| salutant

