
A little insight from the microsoft guys... *long*

Discussion in 'Networking' started by Ddruid, Jan 28, 2005.

Thread Status:
Not open for further replies.
  1. Ddruid

    Ddruid Thread Starter

    Joined:
    Jan 25, 2004
    Messages:
    74
    You will have to forgive my ignorance when it comes to MS products/technologies. If you don't want to read any of the debugging steps/details skip to the paragraph above the code statements.

    This issue has already been resolved, but I would like to get some insight into some possible causes...

    Equipment/background: Cisco 3660 w/OC3 and 4 T1s (one in use). This is a shared router owned by the ILEC (we are owned by the ILEC, so we take care of the telco IP network), configured with ATM circuits for each provider. Customers inbound over the OC3 from a Nortel DSLAM are seen as one ATM circuit per provider, i.e. atm3/0.1 = all of provider 1's customers, atm3/0.2 = all of provider 2's customers, etc. Customers over the T1 are seen as individual ATM PVCs, i.e. atm1/0.1 = customer 1, etc. Each provider is given a virtual bridge, with their customers assigned to their bridge group and routed to dedicated Ethernet ports. Most customers have bridged (UGGHH!!) modems.

    Recently we began seeing high spikes in traffic over the T1 circuit (1500+ pkts/s). Accompanying each incident was a spike on the OC3 circuit of 2-4x the traffic (3000-6000 pkts/s), and the load on the Cisco would reach 100%. What we did not see was that traffic hitting ANY Ethernet port; it was all staying within the ATM interfaces. Unfortunately the DSLAMs in use would not give us any sort of statistical information on a per-customer basis (or at least the CO tech was unable to provide me with any). This left diagnosing the problem to the Cisco. Unfortunately this router is in a remote POP 70 miles away, so doing an IP debug from remote was out of the question... Because the traffic was limited to the ATM interfaces I was not able to do any IP debugging. The only clues I had to work with were: A) The Cisco configuration was gone over multiple times, no errors, and ACLs were unable to stop the traffic. B) It was broadcast traffic; the T1 was seeing high dropped packet counts and all PVCs (T1 & OC3) were seeing a high packet count. C) The Cisco was munching on the directed broadcasts, causing the high load. D) It was likely being caused by someone/something over the T1, because the OC3 appeared to be having reflection issues (the 2-4x traffic issue). E) Possible worm/virus.

    Since I couldn't capture the traffic on the Ethernet segments and I couldn't get any info from the telco, I had to capture the packets on the ATM side. Solution: I built a Linux box with multiple Ethernet cards and installed it at the CO with multiple DSL connections using bridged modems, one from each provider. This in effect put me on the ATM side of the BVIs. I set up a script to monitor the Cisco's load average as well as packet loss and latency. When the box detected a spike it would capture 2000 pkts from each interface.
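
    The monitoring script itself isn't shown here, so the following is only a rough sketch of the trigger-and-capture idea. The interface names, the thresholds, and the output path are assumptions made up for illustration, and how the Cisco's load and the packet loss are actually measured is left to the caller.

```python
import subprocess

# Hypothetical values: the real interface names and trigger levels are not given.
INTERFACES = ["eth1", "eth2", "eth3"]
LOAD_THRESHOLD = 90.0      # router CPU percent that counts as a spike (assumed)
LOSS_THRESHOLD = 5.0       # packet loss percent that counts as a spike (assumed)
PACKETS_PER_CAPTURE = 2000 # matches the 2000 pkts per interface described above


def is_spike(cpu_percent, loss_percent):
    """Return True when either the router load or the packet loss crosses its limit."""
    return cpu_percent >= LOAD_THRESHOLD or loss_percent >= LOSS_THRESHOLD


def capture_command(interface, stamp, count=PACKETS_PER_CAPTURE):
    """Build a tcpdump command that writes `count` packets to a pcap file and exits."""
    outfile = f"/tmp/storm-{interface}-{stamp}.pcap"  # hypothetical location
    return ["tcpdump", "-n", "-i", interface, "-c", str(count), "-w", outfile]


def capture_all(stamp, interfaces=INTERFACES):
    """Start one capture per interface in parallel, then wait for all of them."""
    procs = [subprocess.Popen(capture_command(i, stamp)) for i in interfaces]
    for proc in procs:
        proc.wait()
```

    A polling loop would call is_spike() with fresh readings every few seconds and run capture_all() once when it returns True. Running the captures in parallel matters because a storm could be over before a sequential capture reaches the last interface.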

    The crux of the matter: This revealed that we were in fact seeing netbios storms. The storms were like nothing I have seen before: A) The 'NAME' changed after a given time, and B) the directed broadcast changed over time. It would start at the correct broadcast 192.168.1.255, then it would move to 192.168.255.255, and finally to 192.255.255.255.
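
    For what it's worth, the three destinations line up with the directed broadcast of the same host under a /24, /16 and /8 mask. The sketch below only shows that arithmetic using Python's standard ipaddress module; it does not explain why the sender widened the mask. The helper name broadcast_for is made up for illustration, and 192.168.1.133 is the (already sanitized) source address from the capture.

```python
import ipaddress


def broadcast_for(host, prefix_len):
    """Return the directed broadcast address of the network containing `host`."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
    return str(network.broadcast_address)


for prefix_len in (24, 16, 8):
    print(f"/{prefix_len}: {broadcast_for('192.168.1.133', prefix_len)}")
# /24: 192.168.1.255
# /16: 192.168.255.255
# /8: 192.255.255.255
```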

    IPs and MACs changed to protect the guilty.
    Code:
    00:08:23.218828 00:03:5d:xx:xx:xx > Broadcast, ethertype IPv4 (0x0800), length 110: IP (tos 0x0, ttl  20, id 768, offset 0, flags [none], proto 17, length: 96) 192.168.1.133.netbios-ns > 192.168.255.255.netbios-ns: [udp sum ok]
    >>> NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
    TrnID=0x0
    OpCode=5
    NmFlags=0x11
    Rcode=0
    QueryCount=1
    AnswerCount=0
    AuthorityCount=0
    AddressRecCount=1
    QuestionRecords:
    Name=N1X0F7          NameType=0x00 (Workstation)
    QuestionType=0x20
    QuestionClass=0x1
    
    Name changed:
    00:08:24.015081 00:03:5d:xx:xx:xx > Broadcast, ethertype IPv4 (0x0800), length 110: IP (tos 0x0, ttl  20, id 1024, offset 0, flags [none], proto 17, length: 96) 192.168.1.133.netbios-ns > 192.168.255.255.netbios-ns: [udp sum ok]
    >>> NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
    TrnID=0x2
    OpCode=5
    NmFlags=0x11
    Rcode=0
    QueryCount=1
    AnswerCount=0
    AuthorityCount=0
    AddressRecCount=1
    QuestionRecords:
    Name=WORKGROUP       NameType=0x00 (Workstation)
    QuestionType=0x20
    QuestionClass=0x1
    
    A little later, the IP changed (probably couldn't reach the DHCP server), the name changed, and the broadcast changed.
    00:34:31.139397 00:03:5d:xx:xx:xx > Broadcast, ethertype IPv4 (0x0800), length 110: IP (tos 0x0, ttl  31, id 5632, offset 0, flags [none], proto 17, length: 96) 192.168.1.87.netbios-ns > 192.255.255.255.netbios-ns: [udp sum ok]
    >>> NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
    TrnID=0x8
    OpCode=5
    NmFlags=0x11
    Rcode=0
    QueryCount=1
    AnswerCount=0
    AuthorityCount=0
    AddressRecCount=1
    QuestionRecords:
    Name=V6P7K8          NameType=0x00 (Workstation)
    QuestionType=0x20
    QuestionClass=0x1
    
    Finally my questions: Is this typical netbios behavior? What could have caused this to occur? There was only ONE computer at each of these customers' locations, so I was not seeing 'leaked' traffic. These customers were NOT infected with viruses/trojans, at least none that were detected by NAV2005 as of 1/19. None had file/print sharing enabled. Disabling NetBIOS over TCP/IP did not work. Things like modems and hardware in the DSLAMs were changed, customers moved to new cable pairs, etc. to rule out possible line faults/flaky hardware. It was absolutely being caused by Windows. Putting them behind NAT did not work either.

    Solutions: The temp solution was to install routed modems at the customer premises with firewall rules to block all netbios traffic from reaching our network. As a permanent solution, I am replacing the aging 3660 with an Imagestream router and implementing PPPoE.

    I fought with this network for a couple of weeks, primarily because the problem was random in occurrence and duration; it would last 2 minutes or 2 hours at any time of the day. The customers I set up with the firewalled modems are STILL sending a huge amount of netbios traffic according to the firewall statistics.

    Have I ever mentioned I dislike windows? Have I ever mentioned I dislike bridged networks (the real cause of this issue)?
     
  2. 10forcash

    10forcash

    Joined:
    Aug 7, 2003
    Messages:
    343
    Ddruid,
    Nice post !
    Have a similar problem, although using VSAT. As I understand it, NETBIOS is an inherent problem with Microsoft. Are your T1s in the public domain? Most Windows OSes will broadcast NETBIOS packets as part of the NLA to determine any possible connections. The problem seems to be that if there are several PCs broadcasting NETBIOS, they will talk to each other along the lines of "who's at IP xxx.xxx.xxx.xxx? I don't know - can you check?", which then prompts the other PC(s) to send to that address. Although NETBIOS is meant for non-routable address tables, it doesn't understand routable addresses and will look anyway. Eventually the NETBIOS storms will disappear, at least until a new PC appears and the process starts all over again....
    Bridging modems certainly doesn't help; like small children, NETBIOS loves to see what's on the other side....
    Hope this helps

    Cheers,
    10forcash
     
  3. Ddruid

    Ddruid Thread Starter

    Joined:
    Jan 25, 2004
    Messages:
    74
    Thanks for the reply!

    This particular T1 is a point-to-point connection from the CO to a remote SLICK (Catena) that serves dialtone & DSL. The circuit itself is actually an OC3 that we broke a T1 out of for data.

    What you described is kind of what we were seeing. I noticed it would occasionally begin sometime after I would see a DHCP request come through. The one thing that is different, and it's my fault for not mentioning this in the first post, is that I would see 100-200 netbios pkts/s coming from one or more computers during this time frame. Does NetBIOS continue to send requests if it doesn't get a reply? Shouldn't it time out? I have noticed one customer's NB traffic identifying itself as a 'Domain Controller'; could this be the initiator?
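
    A per-source rate like the 100-200 pkts/s figure can be pulled out of the text captures with a few lines of Python. This is a minimal sketch, not the method actually used: it assumes each packet's tcpdump header sits on one line (the forum wrapped the ones in the first post), and the function name peak_rates is made up for illustration.

```python
import re
from collections import Counter

# Timestamp (to the second) at the start of the line, then the source IP
# immediately before ".netbios-ns > ".
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})\.\d+ .*?(\d{1,3}(?:\.\d{1,3}){3})\.netbios-ns > "
)


def peak_rates(lines):
    """Return {source_ip: most netbios-ns packets seen in any one-second bucket}."""
    per_second = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            second, source = match.group(1), match.group(2)
            per_second[(source, second)] += 1
    peaks = {}
    for (source, _second), count in per_second.items():
        peaks[source] = max(peaks.get(source, 0), count)
    return peaks
```

    Feeding it the lines of a saved capture (for example, open("capture.txt") with a hypothetical file name) gives one peak figure per source, which makes it easy to spot which customer is doing the shouting.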

    The ISP portion of the company, unfortunately, was not around when this network/POP was set up; they contracted another ISP to do the design. The telco bought and sold only USB bridged modems for a year before we were formed and took over control. The telco has agreed to allow us to replace our customers' modems, but we cannot force the other ISPs to replace their customers' modems. We can, however, force them to use PPPoE :D

    Here is a Cricket graph of the CPU percentage from the Cisco (the pink line is from losing power during an ice storm). I wish I had snagged the daily & weekly graphs when this was going on, as they better showed just how randomly it was happening: it could be quiet all day and then start at midnight and last an hour, or start at noon and last 15 minutes. This is the monthly graph, so it's an average per day.

    [Image: monthly Cricket graph of the Cisco's CPU percentage]

    Thanks again for taking the time to read all that and taking the time to respond!
     
  4. 10forcash

    10forcash

    Joined:
    Aug 7, 2003
    Messages:
    343
    Ddruid,
    NETBIOS does have a default timeout of 3600 seconds on client systems. The DC will forward any NETBIOS requests from client PCs to the IP it's looking for, thereby increasing the frequency and randomness of the packets. There is also a retry interval which is anti-logarithmic, but I can't recall the exact formula... The best I can suggest is to resolve the domain and contact the owner to block outbound NETBIOS and DHCP at their firewall. Unfortunately, as it is a 'push' protocol, there's not a lot you can do to prevent your modems accepting this type of traffic, even if your firewall prevents propagation. PPPoE should cure this for you though!

    Cheers,
    10forcash

    p.s. OC3's - long time since I played with them, but definitely the way forward!!
     
  5. Ddruid

    Ddruid Thread Starter

    Joined:
    Jan 25, 2004
    Messages:
    74
    Thanks again. I contacted the customer and the guys went out and replaced their modem with one I set up. The customer was in fact a small business with about 6 computers w/ICS set up.

    I hate implementing PPPoE on the existing customers, primarily because people are so against change. It's just a lot easier to start them off using PPPoE, but they'll just have to deal with it.

    We are actually upgrading the majority of our fiber links to OC-12s. That's a ton more capacity than we'll need in the near future, but the equipment (SONET) makes it a lot easier to interface data & voice over the trunk. Instead of breaking T1s out, they can break it out as Ethernet. :eek:
     
  6. 10forcash

    10forcash

    Joined:
    Aug 7, 2003
    Messages:
    343
    Ddruid,
    Nice one!
    Hope it goes as planned - don't worry, they'll get used to it!
    Getting rid of those bridges will give you some peaceful nights...
    Cheers,
    10forcash
     

Short URL to this thread: https://techguy.org/324506
