SERIOUS: Maths, System unable to allocate memory resource

  • SERIOUS: Maths, System unable to allocate memory resource

    Hi,

    We are seeing the following error when running BurnInTest. Here is a log of the run. Any ideas? I assume that since it is "Maths" it has something to do with the CPU, but could you point us in the right direction? I am told we have already changed the CPUs, motherboard, and memory so far...

    CPU/RAM @ 90%
    HDD @ 100%
    Network @ 90%

    Not worried about the single network error obviously.

    Thanks,
    S


    PassMark BurnInTest Log file - http://www.passmark.com
    ========================================================

    Date: 12/21/09 09:19:34

    BurnInTest V6.0 Pro 1017
    (64-bit)
    Technician: xxx
    Logging detail level: Normal

    **************
    SYSTEM SUMMARY
    **************
    Windows 7 Ultimate Edition build 7600 (64-bit),
    2 x Intel(R) Xeon(R) CPU E5530 @ 2.40GHz,
    48.0GB RAM,
    3724GB HDD,
    CD/DVDRW,

    GENERAL
    System Name: MININT-241F761
    Motherboard Manufacturer: Supermicro
    Motherboard Name: X8DTN
    Motherboard Version: 1.1
    BIOS Manufacturer: American Megatrends Inc.
    BIOS Version: 080015
    BIOS Release Date: 10/07/2009
    Serial number: xxxxxx

    CPU
    CPU manufacturer: GenuineIntel
    CPU Type: Intel(R) Xeon(R) CPU E5530 @ 2.40GHz
    Codename: Gainestown
    CPUID: CPU1: Family 6, Model 1A, Stepping 5, Revision D0
    CPUID: CPU2: Family 6, Model 1A, Stepping 5, Revision D0
    Socket: LGA1366
    Lithography: 45nm
    Physical CPU's: 2
    Cores per CPU: 4
    Hyperthreading: Enabled
    CPU features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 DEP PAE Intel64 VMX
    Clock frequencies:
    Measured CPU speed: 2401.3 MHz
    Multiplier: x18.0
    Base Clock: 133.3 MHz
    Muliplier range: Min: x12, Max non turbo: x18
    Cache per CPU package:
    L1 Instruction Cache: 4 x 32 KB
    L1 Data Cache: 4 x 32 KB
    L2 Cache: 4 x 256 KB
    L3 Cache: 8 MB
    TDP Limit: 80 Watts
    TDC Limit: 70 Amps

    MEMORY
    Total Physical Memory: 49143MB
    Available Physical Memory: 46781MB
    Memory devices:
    0:
    - 4096MB, 1024MHz,
    1:
    - 4096MB, 1066MHz,
    2:
    - Not populated
    3:
    - 4096MB, 1066MHz,
    4:
    - 4096MB, 1066MHz,
    5:
    - Not populated
    6:
    - 4096MB, 1066MHz,
    7:
    - 4096MB, 1066MHz,
    8:
    - Not populated
    9:
    - 4096MB, 1066MHz,
    10:
    - 4096MB, 1066MHz,
    - Serial number: 
    11:
    - Not populated
    12:
    - 4096MB, 1066MHz,
    13:
    - 4096MB, 1066MHz,
    14:
    - Not populated
    15:
    - 4096MB, 1066MHz,
    16:
    - DDR3, 4096MB, 1066MHz,
    17:
    - Not populated

    DISK VOLUMES
    C: Local drive, NTFS, (3724.00GB total, 3723.80GB free)
    D: Optical drive, DVD-ROM UJDA780
    X: Local drive, Boot, NTFS, (0.03GB total, 0.03GB free)

    DISK DRIVES
    Fixed disk (Size: 3724.00GB)

    OPTICAL DRIVES
    D: MATSHITA DVD-ROM UJDA780 (CD/DVDRW)

    NETWORK
    Intel(R) 82576 Gigabit Dual Port Network Connection
    Intel(R) 82576 Gigabit Dual Port Network Connection

    PORTS

    **************
    RESULT SUMMARY
    **************
    Test Start time: Fri Dec 18 11:24:17 2009
    Test Stop time: Mon Dec 21 08:55:50 2009
    Test Duration: 069h 31m 33s
    Temperature CPU 1 average (Min/Current/Max): 36.5C / 48.3C / 51.8C
    Temperature CPU 1 core 1 (Min/Current/Max): 39.0C / 52.0C / 54.0C
    Temperature CPU 1 core 2 (Min/Current/Max): 33.0C / 47.0C / 50.0C
    Temperature CPU 1 core 3 (Min/Current/Max): 39.0C / 48.0C / 53.0C
    Temperature CPU 1 core 4 (Min/Current/Max): 35.0C / 46.0C / 50.0C
    Temperature CPU 2 average (Min/Current/Max): NA / NA / NA
    Temperature CPU 2 core 1 (Min/Current/Max): NA / NA / NA
    Temperature CPU 2 core 2 (Min/Current/Max): NA / NA / NA
    Temperature CPU 2 core 3 (Min/Current/Max): NA / NA / NA
    Temperature CPU 2 core 4 (Min/Current/Max): NA / NA / NA

    Test Name      Cycles   Operations        Result   Errors   Last Error
    CPU            59131    817 Trillion      FAIL     3272     System unable to allocate memory resource
    Memory (RAM)   482      47.028 Trillion   PASS     0        No errors
    Network 1      8607     68.863 Million    FAIL     1        Timeout waiting for packet
    Disk (C:)      26       2.103 Trillion    PASS     0        No errors
    TEST RUN FAILED

    ******************
    DETAILED EVENT LOG
    ******************
    SERIOUS: 2009-12-18 11:38:03, Maths, System unable to allocate memory resource
    SERIOUS: 2009-12-18 11:38:03, Maths, System unable to allocate memory resource
    SERIOUS: 2009-12-18 11:38:03, Maths, System unable to allocate memory resource

    .
    .
    .
    (cut for brevity)
    .
    .
    .
    SERIOUS: 2009-12-18 11:41:01, Maths, System unable to allocate memory resource
    LOG NOTE: 2009-12-19 10:27:23, Network 1, User set Network timeout exceeded.
    SERIOUS: 2009-12-19 10:27:24, Network, Timeout waiting for packet
    LOG NOTE: 2009-12-21 08:55:50, Status, Test run stopped
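
For reference when reading the event log, the sketch below works out how long after the test start the first and last "Maths" entries shown above were recorded. The three timestamps are copied from the log; the function and constant names are illustrative only and are not part of BurnInTest. It assumes the default English month and weekday names when parsing the start time.

```python
from datetime import datetime

# Timestamps copied from the log above.
TEST_START = "Fri Dec 18 11:24:17 2009"
FIRST_MATHS_ERROR = "2009-12-18 11:38:03"
LAST_MATHS_ERROR = "2009-12-18 11:41:01"


def seconds_since_start(event: str, start: str = TEST_START) -> int:
    """Return whole seconds between the test start and an event timestamp."""
    started = datetime.strptime(start, "%a %b %d %H:%M:%S %Y")
    happened = datetime.strptime(event, "%Y-%m-%d %H:%M:%S")
    return int((happened - started).total_seconds())


def as_minutes_seconds(seconds: int) -> str:
    """Format a duration in seconds as 'Xm YYs'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


if __name__ == "__main__":
    for label, stamp in (("first", FIRST_MATHS_ERROR), ("last", LAST_MATHS_ERROR)):
        print(label, as_minutes_seconds(seconds_since_start(stamp)))
```

Comparing these offsets between runs gives a quick way to see whether the errors start at a similar point each time.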

  • #2
    This basically means that the CPU test was unable to allocate memory for the small test data set it uses. As this occurs after a long time, it is most likely a memory leak (memory allocated but not released by software).

    Are you running any other software at the same time as BurnInTest? If so, I would close it.

    We have had a report of a 'disk activity logging' device driver with a memory leak, where the BurnInTest disk test activity provokes a problem in the device driver where it allocates, but does not free the memory, and eventually the system runs out of memory.

    I think this is unlikely to be a hardware issue; it is more likely a software issue such as a device driver (or possibly even a BurnInTest issue).

    To help investigate, can you try the test
    1) without the disk test
    2) without the disk test and without the Network test
    3) without the disk test, the Network test and the CPU temperature logging.

    I would turn on 1-minute periodic logging and trace level 1 logging, as this will output the available memory every minute. I would also open Task Manager and look at memory usage during the different tests.
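
One way to judge whether the per-minute available-memory readings point to a leak is to fit a straight line through them and look at the slope. The following is a minimal sketch under stated assumptions: the readings are supplied as a plain list of MB values, one per minute, copied by hand from the trace output (the trace file format itself is not parsed here), and the function names and the -1 MB/min threshold are arbitrary choices for illustration.

```python
from statistics import mean


def memory_trend_mb_per_min(samples_mb):
    """Least-squares slope of available memory (MB) against minute index.

    A clearly negative slope means available memory is steadily falling.
    """
    n = len(samples_mb)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples_mb)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples_mb))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def looks_like_leak(samples_mb, threshold_mb_per_min=-1.0):
    """Return True if memory falls faster than the (assumed) threshold."""
    return memory_trend_mb_per_min(samples_mb) <= threshold_mb_per_min


if __name__ == "__main__":
    # Made-up readings for illustration only, not taken from a real run.
    readings = [1000, 990, 980, 970]
    print(memory_trend_mb_per_min(readings), looks_like_leak(readings))
```

Running the same check on the readings from each of the three test combinations would show which test, if any, makes the decline stop.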

    I notice you have a small X: drive. Are you running under Microsoft WinPE?

    This should help work out where the problem lies.

    Best regards,
    Ian



    • #3
      Thanks for the input!

      Just so you know, we are running BurnInTest from WinPE, so there is no Task Manager, no other software, etc. We are just running "bit.exe -x" from the command line. Does that make a difference?

      Edit: Oh, and out of 250+ systems, this is the first one on which we have seen this issue under WinPE. It was also happening with build 1014; we are now using 1017 and see the same issue.

      Edit #2: Apparently the system had to be shipped as is...just went to try what you suggested but the customer took it already. If I see the issue again, I will try your suggestions and post back to this thread. Thanks.

      Thanks,
      S
      Last edited by negated; 12-21-2009, 11:53 PM.



      • #4
        In some cases, a WinPE build may contain different drivers from a full Windows build. However, if you are using the default WinPE build, then I think that makes it less likely to be some obscure device driver issue, like the disk activity logging device driver I mentioned. That said, I still think it is most likely to be a memory leak.

        Regards,
        Ian



        • #5
          I finally have another machine that is exhibiting the same problem as above. I was able to stop the error from occurring by not running the HDD test (set at 100%). The HDDs in question are a RAID-1 volume via the onboard ICH10R controller on the SuperMicro X8DTN+ motherboard AND a RAID-5 volume via a 3Ware 9650SE controller.

          Running the HDD test on any one "drive" did not cause the error, but running the test on both drives causes the issue each time (starts having these problems in about 45-50 minutes).

          This system also has 48GB of RAM, and we are using the 64-bit variant of build 1017 (was also happening with 1014).

          Any thoughts?

          Thanks,
          -S



          • #6
            I would first suspect that a RAID driver has a memory leak and the system runs out of memory (45 minutes is pretty quick for a memory leak problem).

            Did the 1-minute periodic logging show a declining amount of memory and/or a declining amount of virtual memory?
            Have you seen the problem when not using WinPE?
            What is the configuration of the disks and can you send me your configuration file (.bitcfg)?
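
To give a sense of scale for why 45 minutes is quick, the sketch below estimates how fast a leak would have to run to use up all available physical memory in that time. It is a rough upper bound under stated assumptions: the 46781 MB figure is the "Available Physical Memory" from the first system's log and is only assumed to be similar on this second 48GB system, and the function name is illustrative. If a smaller resource is being exhausted instead, the actual leak rate would be lower.

```python
def leak_rate_mb_per_min(available_mb: float, minutes_to_failure: float) -> float:
    """Average rate (MB/min) needed to consume available_mb in the given time."""
    if minutes_to_failure <= 0:
        raise ValueError("minutes_to_failure must be positive")
    return available_mb / minutes_to_failure


# Assumption: taken from the first system's log, not measured on this machine.
AVAILABLE_MB = 46781

if __name__ == "__main__":
    for minutes in (45, 50):  # failure window reported in post #5
        rate = leak_rate_mb_per_min(AVAILABLE_MB, minutes)
        print(f"{minutes} min: {rate:.0f} MB/min (about {rate / 60:.1f} MB/s)")
```

Under these assumptions the leak would need to average somewhere around 1 GB per minute, which the 1-minute periodic logging should make obvious if it is happening.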

            You can send files to the email address shown on this page:
            http://www.passmark.com/support/index.htm

            Thanks.
            Ian



            • #7
              Originally posted by Ian (PassMark):
              I would first suspect that a RAID driver has a memory leak and the system runs out of memory (45 minutes is pretty quick for a memory leak problem).

              We know it's not (just) the RAID driver, because we test dozens of systems each month with the same environment and RAID card/driver for multiple days without failure. We only seem to see the issue when another controller is added on top of that, and even then only on 2 specific systems so far (out of hundreds).

              Originally posted by Ian (PassMark):
              Did the 1-minute periodic logging show a declining amount of memory and/or a declining amount of virtual memory?
              Have you seen the problem when not using WinPE?
              What is the configuration of the disks and can you send me your configuration file (.bitcfg)?

              You can send files to the email address shown on this page:
              http://www.passmark.com/support/index.htm

              Thanks.
              Ian
              I will shoot you the bitcfg as soon as I get a chance; I don't have the log file (that is just the log file that is shown in the third tab of the interface, correct?) for that specific machine, but when we see the issue again I will send both files. I didn't notice what the last log said about the virtual memory.

              I don't think we've seen the issue without WinPE, but then again, most of our systems are required to be tested with WinPE since we sell (most) systems without OSes and we don't have time to install OSes on each system.

              Thanks for getting back to me! It is appreciated.
              -S



              • #8
                Yes, the log file output is configured in the 3rd tab, Preferences->Logging, Trace file detail level = trace level 1 (or 2 for more detail), periodic logging = 1 minute, result files are *.trace.

                Regards,
                Ian

