Friday, January 20, 2012

packet receiving in windows kernel


On a multicore system, an Ethernet NIC may distribute its interrupts across multiple cores. In this situation, how can the kernel guarantee that packets are delivered to the TCP/IP stack in order, especially given that packet processing in the kernel is split into ISR and DPC (the bottom half, in Linux kernel terms) phases? The MSDN article "Non-RSS Receive Processing" clearly describes the packet receiving process and shows how in-order delivery is guaranteed.




Saturday, January 07, 2012

core dump in windows

One of my favorite Linux features, and one I miss, is the core dump. When a program crashes, the OS takes a snapshot of the crashed program's memory image and writes it to a file, usually named 'core'. The core file is extremely useful for debugging, since it contains the state of the program at the time of the crash, such as the stack trace.

After switching to Windows, I kept looking for the same feature. Recently I ran across an MSDN article, "Collecting User-Mode Dumps". On Windows Vista and later, you can have such dump files created by modifying the system registry (the LocalDumps key under Windows Error Reporting).

Bingo!

Saturday, December 17, 2011

painful debugging due to forgetting to cancel asynchronous IO operations

On Windows, you can issue asynchronous IO operations to the kernel. Sometimes you cannot afford to wait for an operation to complete; in that case you need to call CancelIoEx() to cancel it.

My recent code did not cancel the operation; it simply freed the hEvent associated with the asynchronous IO operation. The consequences were severe. The stack of my program was corrupted, presumably because the kernel still completed the pending operation and wrote its results into memory that was no longer valid for that purpose, and I hit a STATUS_STACK_BUFFER_OVERRUN error.

I spent a lot of time, at first to no avail, trying to understand the STATUS_STACK_BUFFER_OVERRUN error and examining the assembly code. Eventually I found that the stack had been modified by someone else. I then re-checked the entire code logic and finally caught the culprit.

Thursday, December 15, 2011

re-enable classic start menu on windows server 8

I am not a big fan of the Metro-style UI on Windows 8, especially on Windows Server 8. So the very first thing I did after installing Windows Server 8 was to turn off the Metro-style UI and get the classic start menu back. Here is how:

In regedit,

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer

change value of RPEnabled from "1" to "0"

Sunday, December 04, 2011

Calling conventions of VC

VC has three compiler options that select the default calling convention:
  • /Gd, the default setting, specifies the __cdecl calling convention for all functions except C++ member functions and functions marked __stdcall or __fastcall.
  • /Gr specifies the __fastcall calling convention for all functions except C++ member functions and functions marked __cdecl or __stdcall. All __fastcall functions must have prototypes.
  • /Gz specifies the __stdcall calling convention for all functions except C++ member functions and functions marked __cdecl or __fastcall. All __stdcall functions must have prototypes.
Decorated names for the different calling conventions:
  • For C, the __cdecl naming convention uses the function name preceded by an underscore ( _ ). The function arguments are passed on the stack in reverse order (right to left) so that the callee can access them in the correct order. The caller is responsible for popping the arguments after the function returns, which makes it possible to use ... (a variable argument list) to pass a number of arguments known only at run time. Return values are returned in registers.
     _functionname
  • For C, the __fastcall naming convention uses the function name preceded by an at sign (@) and followed by an at sign (@) and the size of the function's arguments in bytes. Some of a __fastcall function's arguments are passed in registers (for x86 processors, ECX and EDX), and the rest are pushed onto the stack from right to left. The called routine pops these arguments from the stack before it returns.
     @functionname@number
  • For C, the __stdcall naming convention uses the function name preceded by an underscore ( _ ) and followed by an at sign (@) and the size of the function's arguments in bytes. A __stdcall function's arguments are pushed onto the stack from right to left, and the called function pops these arguments from the stack before it returns.
     _functionname@number
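
The three naming rules above can be sketched as a small formatter. This is only an illustration of the 32-bit x86 C decoration patterns listed here, not compiler code; the names decorate and enum callconv are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names, introduced only for this sketch. */
enum callconv { CC_CDECL, CC_STDCALL, CC_FASTCALL };

/*
 * Write the decorated C name for a function into out.
 * argbytes is the total size of the arguments in bytes (ignored for
 * __cdecl). Returns the value of snprintf, or -1 for an unknown
 * convention.
 */
int decorate(char *out, size_t outsz, enum callconv cc,
             const char *name, unsigned argbytes)
{
    switch (cc) {
    case CC_CDECL:    /* _functionname */
        return snprintf(out, outsz, "_%s", name);
    case CC_STDCALL:  /* _functionname@number */
        return snprintf(out, outsz, "_%s@%u", name, argbytes);
    case CC_FASTCALL: /* @functionname@number */
        return snprintf(out, outsz, "@%s@%u", name, argbytes);
    }
    return -1;
}
```

For a function int add(int, int), which takes 8 bytes of arguments, this yields _add for __cdecl, _add@8 for __stdcall and @add@8 for __fastcall.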

Calling Conventions supported by VC






Monday, November 28, 2011

EINTR

A long time ago, I learned that certain system calls return EINTR while blocked. I did not know why, and never got around to asking. Recently I started thinking about this again, and the answer turned out to be in a longtime favorite book of mine, APUE.

"A characteristic of earlier UNIX systems is that if a process caught a signal while the process was blocked in a 'slow' system call, the system call was interrupted. The system call returned an error and errno was set to EINTR. This was done under the assumption that since a signal occurred and the process caught it, there is a good chance that something has happened that should wake up the blocked system call." -- Section 10.5, Richard Stevens.

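In practice, code that does not care about the interruption usually just restarts the call. A minimal sketch of that pattern for read() follows; read_retry is a hypothetical helper name, not something from APUE.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: behaves like read(), but restarts the call
 * when it fails only because a caught signal interrupted it (EINTR).
 * Any other error, end of file, or a successful read is returned
 * to the caller unchanged.
 */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

The same loop shape works for other slow calls such as write() or accept(). Whether restarting is the right reaction depends on why the signal was sent, which is exactly the assumption the quote describes.
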
Thursday, November 17, 2011

openflow indigo architecture

Indigo is a project to produce OpenFlow-enabled hardware switches. Here is the software architecture of an OpenFlow-enabled hardware switch, as posted on their website.


Here we can see several long-running processes, but there are no details about what each process does. Which of them talks to the OpenFlow controller? Here is the missing information. As you can see, ofprotocol and ofdatapath are the ones that actually talk to the OpenFlow controllers and monitors.

Wednesday, November 02, 2011

identify the cause of high cpu utilization via xperf




1. Install xperf. xperf now ships with the Windows SDK; the setup file for amd64 is wpt_x64.exe.

2. Start tracing
xperf -on latency -stackwalk Profile

3. After a while, stop tracing and write the trace file trace.etl.
xperf -d trace.etl

4. Analyze the result. You need to provide the pdb files for function names to be shown.
xperf trace.etl

As you can see, sal_usleep() is the culprit and takes most of the CPU time.



After we fix sal_usleep(), the CPU utilization drops significantly.


Another powerful feature of xperf is stack walking, which outputs the call stacks leading to a function. More information can be found here (stack walking) and here (enable stack walking on x64).

Friday, September 30, 2011

sending packets from a multihomed host



The Next Generation TCP/IP stack in Windows Vista and Windows Server 2008 supports strong host sends and receives for both IPv4 and IPv6 by default on all interfaces except the Teredo tunneling interface for a Teredo host-specific relay. 

Tuesday, September 20, 2011

You are unable to view Roles and Features and receive error code 0x800706BE in Server Manager

check this article.


1. Run the System Update Readiness Tool. http://support.microsoft.com/kb/947821
2. Check C:\Windows\logs\CBS\Checksur.log for the missing or corrupted files.
3. takeown /F c:\Windows\Servicing\Packages /D y /R
4. cacls c:\Windows\Servicing\Packages /E /T /C /G "UserName":F
5. Download the KB packages that contain the missing files.
6. Expand -F:* UpdateKBXXXX.msu x:\DestinationDirectory
7. Expand -F:* UpdateKBXXXX.CAB x:\DestinationDirectoryCAB
8. Copy missing or corrupted files.

Tuesday, August 02, 2011

HTPC


AMD’s APU Fusion solution


  • Single- or dual-core 64-bit processors, codenamed Ontario and Zacate, with the following:
    • made on 40 nm CMOS process
    • support for DDR3 1066 MHz memory
    • 9W or 18W TDP
  • Radeon HD 6xxx GPU on 40 nm process

Mobile chipset: A50M (Hudson-M1) 4.7W TDP

Hudson Chipset comparison

Overall, the AMD E-350 platform is suitable for an HTPC. It can do hardware HD decoding, and overall power is around 40W. There are plenty of motherboards built around the E-350 platform. We’ll introduce them in the next post.

Friday, April 22, 2011

linux time in hyper-v

Windows does not keep the BIOS hardware clock in UTC, so the hardware time on a Windows server is always set to local time. Linux, however, treats that time as UTC by default. A Linux virtual machine running in Hyper-V therefore interprets the host's local time as UTC and ends up with an incorrect time.

Here is what you should do:

In /etc/default/rcS, set UTC=no

Then use tzselect to select correct time zone.

Tuesday, February 22, 2011

ioctl method

This post collects pointers on the difference among METHOD_BUFFERED, METHOD_IN_DIRECT and METHOD_OUT_DIRECT, the transfer methods of an I/O control code.

osronline has an article about this, check here.

codeproject has an example about this, check here.
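
As a small aid while reading those articles: the transfer method is stored in the two low-order bits of the I/O control code. The sketch below re-creates the CTL_CODE layout and the METHOD_* values from the Windows headers in portable C so that it compiles anywhere; in a real driver they come from the Windows headers instead. ioctl_method is a hypothetical helper, and the function number 0x800 is an arbitrary value chosen for illustration.

```c
#include <stdint.h>

/* Values as defined in the Windows headers, repeated here only so the
 * sketch is self-contained. Do not combine with the real headers. */
#define METHOD_BUFFERED     0u
#define METHOD_IN_DIRECT    1u
#define METHOD_OUT_DIRECT   2u
#define METHOD_NEITHER      3u
#define FILE_DEVICE_UNKNOWN 0x22u
#define FILE_ANY_ACCESS     0u

/* Layout: device type in bits 16-31, access in bits 14-15,
 * function number in bits 2-13, transfer method in bits 0-1. */
#define CTL_CODE(dev, func, method, access)                   \
    (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) |   \
     ((uint32_t)(func) << 2) | (uint32_t)(method))

/* Hypothetical helper: extract the transfer method from a code. */
static inline uint32_t ioctl_method(uint32_t code)
{
    return code & 3u;
}
```

With these definitions, CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS) evaluates to 0x00222000, and changing only the method to METHOD_OUT_DIRECT gives 0x00222002.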

Monday, January 03, 2011

Switch system volume in windows 2008 R2

First, here is the definition of system volume and boot volume under Windows.

Second, here are the steps to change the system volume from HDD3 to HDD1.

  1. The boot files consist of a folder called "Boot" and an application called "bootmgr". They are system files, and so are normally hidden. You must make them visible.
  2. Copy them both from HDD3 to HDD1. It will complain that two files cannot be copied: "BCD" and "BCD.log", both of which are in the "Boot" folder. The latter is unimportant (it's just a log file) and can be ignored; the former will be copied in the next step.
  3. Open a command prompt with Administrator privileges. Type the following command "bcdedit /export HDD1:\Boot\BCD", where HDD1 is the drive letter of your destination drive. In my case, HDD1 was C, so the command was "bcdedit /export C:\Boot\BCD".
  4. Now, you have to edit the BCD file so that the entries are all correct. I used the bcdedit program which comes with Windows 7, but there are others. I've heard EasyBCD is pretty good (and free), but the latest version doesn't fully support Windows 7. I believe a new version that does support Windows 7 will be out shortly. First, I navigate to HDD1\Boot on the command prompt (with Admin privileges) and type "bcdedit /store BCD /enum ALL". I use the /store switch to specify the BCD file in the current directory - omitting this switch will use the default one I think, which is on HDD3. This command lists all the entries in the BCD file.
  5. I scanned the entries, and it turns out that the entries for {bootmgr} and {memdiag} both point to HDD3. Of course, I want them to point to HDD1. So I issue commands in the following form: "bcdedit /store BCD /set [entry name] device partition=HDD1:". Since HDD1 is C on my machine, I would type the following to set the {bootmgr} entry: "bcdedit /store BCD /set {bootmgr} device partition=C:".

That's about it. One last step is to make HDD1 Active via Disk Management (accessed from Computer Management) - it was already Active on my system, so I didn't have to do anything.

Wednesday, December 01, 2010

tracking bcm source code

First, import the original bcm code into our repository

1. find . -type d | xargs cvs add

2. find . -type f | grep -v CVS | xargs cvs add

Get the CVSed source code from the repository

3. cvs co sswitch

Create a branch

4. cvs tag -b sdk566-patches

5. cvs update -r sdk566-patches

Modify the branch

Update the trunk to sdk580

Create a new branch

6. cvs tag -b sdk580-patches

7. cvs update -r sdk580-patches

Incorporate the changes in the branch into the trunk

8. cvs update -j sdk566-patches

Modify the code.

Wednesday, September 08, 2010

ubuntu 10.04: sluggish console on hyper-v

I’m very disappointed to see that the server edition console is unbearably slow under hyper-v. To work around this issue, you need to disable the frame buffer module:

edit /etc/modprobe.d/blacklist-framebuffer.conf and add the following line:

blacklist vga16fb

Reboot and the console should be fine.

Sunday, August 08, 2010

memory latency under Nehalem arch

Memory latency matters. As we can see, a memory access costs 65 ns to 106 ns on the Nehalem architecture.

This figure is from the paper by Daniel Molka et al., “Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System”, Intl Conf on Parallel Architectures and Compilation Techniques, 2009.
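
Latencies like these are commonly estimated with a pointer-chasing loop, in which every load depends on the result of the previous one. The following is a rough sketch of that general technique, not the benchmark used in the paper; build_cycle, chase and ns_per_load are hypothetical names, and clock() measures CPU time at coarse resolution, so the result is only meaningful for a large number of steps over an array much larger than the caches.

```c
#include <stdlib.h>
#include <time.h>

/* Fill next[] with one random cycle covering all n slots
 * (Sattolo's algorithm), so that a chase visits every slot. */
void build_cycle(size_t *next, size_t n)
{
    if (n == 0)
        return;
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* 0 <= j < i */
        size_t t = next[i];
        next[i] = next[j];
        next[j] = t;
    }
}

/* Follow the chain for `steps` dependent loads, starting at slot 0. */
size_t chase(const size_t *next, size_t steps)
{
    size_t i = 0;
    while (steps--)
        i = next[i];
    return i;
}

/* Average time per dependent load, in nanoseconds. */
double ns_per_load(const size_t *next, size_t steps)
{
    clock_t start = clock();
    volatile size_t sink = chase(next, steps);  /* keep the result live */
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) * 1e9 / CLOCKS_PER_SEC / (double)steps;
}
```

Because each index comes from the previous load, the CPU cannot overlap the accesses, so the average time per step approaches the latency of whichever level of the memory hierarchy the array fits in.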


Tuesday, June 29, 2010

Microsoft Failover Cluster

Here are some basic articles on using Hyper-V and Microsoft Failover Cluster:

Hyper-V: Using Hyper-V and Failover Clustering

Hyper-V: Using Live Migration with Cluster Shared Volumes in Windows Server 2008 R2

Failover Clusters

Here are some useful techniques for managing Microsoft Failover Cluster:

1. Forcibly removing the failover cluster feature after a cluster failure.

2. Duplicated MAC address for the Microsoft Failover Cluster Virtual Miniport driver. In our testbed, the OS images are cloned, so they have the same MAC address for the Failover Cluster Virtual Miniport driver. This prevents the two nodes from joining the cluster. To join them, we must reinstall the failover cluster feature on these machines.

3. How to create the cluster.log in Windows Server 2008 Failover Clustering

Thursday, June 03, 2010

Sharing Memory Between Drivers and Applications

//
// Assumed driver-global storage for the mapping (not declared in the
// original snippet), so that it can be unmapped later.
//
static PVOID StoredPointer;
static PMDL StoredMdl;

PVOID
CreateAndMapMemory(VOID)
{
    PVOID buffer;
    PMDL mdl;
    PVOID userVAToReturn = NULL;

    //
    // Allocate a 4K buffer to share with the application
    //

    buffer = ExAllocatePoolWithTag(NonPagedPool, PAGE_SIZE, 'MpaM');

    if(!buffer) {
        return(NULL);
    }

    //
    // Allocate and initialize an MDL that describes the buffer
    //

    mdl = IoAllocateMdl(buffer,
                        PAGE_SIZE,
                        FALSE,
                        FALSE,
                        NULL);

    if(!mdl) {
        ExFreePool(buffer);
        return(NULL);
    }

    //
    // Finish building the MDL -- Fill in the "page portion"
    //

    MmBuildMdlForNonPagedPool(mdl);

    //
    // The preferred V5 way to map the buffer into user space.
    // With UserMode the call can raise an exception on failure instead
    // of returning NULL, so it is wrapped in __try/__except.
    //

    __try {
        userVAToReturn =
            MmMapLockedPagesSpecifyCache(mdl,                 // MDL
                                         UserMode,            // Mode
                                         MmCached,            // Caching
                                         NULL,                // Address
                                         FALSE,               // Bugcheck?
                                         NormalPagePriority); // Priority
    } __except(EXCEPTION_EXECUTE_HANDLER) {
        userVAToReturn = NULL;
    }

    //
    // If we get NULL back, the request didn't work.
    // I'm thinkin' that's better than a bug check anyday.
    //

    if(!userVAToReturn) {
        IoFreeMdl(mdl);
        ExFreePool(buffer);
        return(NULL);
    }

    //
    // Store away both the mapped VA and the MDL address, so that
    // later we can call MmUnmapLockedPages(StoredPointer, StoredMdl)
    //

    StoredPointer = userVAToReturn;
    StoredMdl = mdl;

    DbgPrint("UserVA = 0x%p\n", userVAToReturn);

    return(userVAToReturn);
}


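How the code works:

The driver can allocate the buffer to be shared using any standard method. If there are no special requirements and the size is modest, the buffer can be allocated from non-paged pool.

The driver calls IoAllocateMdl() to allocate an MDL that describes the buffer, then calls MmBuildMdlForNonPagedPool(). This function updates the MDL so that it describes a non-paged region of kernel-mode memory.

Once the MDL describing the shared buffer has been built, the driver is ready to map the buffer into the address space of the user process, which is done by MmMapLockedPagesSpecifyCache().

You must call MmMapLockedPagesSpecifyCache() in the context of the process into which you want to map the shared buffer, with the AccessMode parameter set to UserMode. The function returns the user-mode virtual address at which the pages described by the MDL are mapped. The driver can return this value to the application as the result of an IOCTL request sent by the application.

Note: IoAllocateMdl only allocates the MDL; it does not fill in the page numbers inside the MDL. MmBuildMdlForNonPagedPool must be called to complete the initialization.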



ref: A Common Topic Explained - Sharing Memory Between Drivers and Applications

Sunday, May 30, 2010

NPU

Broadcom

     XGS Core Product Line

          BCM88025

          BCM8823x

          BCM88235

Ethernity

          ENET3x00

          ENET4x00

EZchip

          NPA

          NP-3

          NP-4

LSI

          ACP3448

          APP3300

Netronome

          IXP2855 (Intel Castine)

          NFP-3216

          NFP-3240

TPack

          TPX3103

          TPX4004

          TPX5104

Wintegra

          WinPath2

          WinPath2-Lite

          WinPath3

          WinPath3-SL

Xelerated

          AX310

          HX320

          HX330

Legacy Vendors:

     AppliedMicro

          nP37x0

          nP3705

     Exar (Hifn)

          5NP4G

     Mindspeed

          M27479

          M27480

          M27481