<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Garshasp Development Blog &#187; Engine</title>
	<atom:link href="http://dev.garshasp.ir/blog/archives/category/engine/feed" rel="self" type="application/rss+xml" />
	<link>http://dev.garshasp.ir/blog</link>
	<description>The stuff that our lives are made of!</description>
	<lastBuildDate>Wed, 18 May 2011 07:01:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Sharif CGS 2010 Presentations</title>
		<link>http://dev.garshasp.ir/blog/archives/192</link>
		<comments>http://dev.garshasp.ir/blog/archives/192#comments</comments>
		<pubDate>Fri, 23 Apr 2010 09:53:12 +0000</pubDate>
		<dc:creator>Yaser Zhian</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[Engine]]></category>
		<category><![CDATA[Exhibition]]></category>
		<category><![CDATA[Low-level]]></category>
		<category><![CDATA[R&D]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[presentation]]></category>
		<category><![CDATA[production workflow]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=192</guid>
		<description><![CDATA[(Well, one of them for now&#8230;) I thought I&#8217;d post the presentation I used for my talk in the conference on game development held at Sharif University a couple of months back. It&#8217;s about the decisions you have to make and the things that you should do at the beginning of a game project, to [...]]]></description>
			<content:encoded><![CDATA[<p>(Well, one of them for now&#8230;)<br />
I thought I&#8217;d post the <a href="http://yaserzt.com/blog/wp-content/uploads/2010/04/CGS10-yzt-PlanningForDebuggingDay-rev05.pdf">presentation</a> I used for my talk in the <a href="http://cgs.sharif.ir/">conference on game development</a> held at Sharif University a couple of months back.<br />
It&#8217;s about the decisions you have to make and the things that you should do at the beginning of a game project, to make your team&#8217;s and your own lives easier later on and throughout the development cycle. This is an ongoing experience and collection of ideas for me, so I&#8217;ll be looking forward to any suggestions, discussions, critique, humiliation, praise and/or whatnot!</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/192/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Run Please! (or how do we enjoy a cup of coffee while the bugs are being chased.)</title>
		<link>http://dev.garshasp.ir/blog/archives/184</link>
		<comments>http://dev.garshasp.ir/blog/archives/184#comments</comments>
		<pubDate>Fri, 26 Mar 2010 16:17:54 +0000</pubDate>
		<dc:creator>Yaser Zhian</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[Engine]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[No beach for you!]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Zorvan]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=184</guid>
		<description><![CDATA[In the past weeks, a very common sight at Fanafzar was a series of 4 or 5 machines, all running Garshasp on a pre-recorded command sequence (or timedemo, or whatever you might want to call it,) trying to get the game to crash or fire an assertion or behave erratically to help us pinpoint some [...]]]></description>
			<content:encoded><![CDATA[<p>In the past weeks, a very common sight at Fanafzar was a series of 4 or 5 machines, all running Garshasp on a pre-recorded command sequence (or timedemo, or whatever you might want to call it,) trying to get the game to crash or fire an assertion or behave erratically to help us pinpoint some intermittent or hard to reproduce bug.</p>
<p>First of all, we have a quite cool replay feature in Zorvan (our engine) that lets us record and then play back a game session. It&#8217;s not perfect, and not quite fit for end-users, but you wouldn&#8217;t imagine how useful it has been (and will be) to us.</p>
<p>This replay system is serving as our unit test (&#8220;You added that feature? Let&#8217;s see if the boat sequence is playable now.&#8221;) and our regression test (&#8220;You committed that fix? Let&#8217;s see if the game is still playable!!!&#8221;) and our performance test and playability test and much more. Since the structure of our game is linear by design, we can get very good coverage with a straightforward replaying of the entire game.</p>
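<p>A minimal sketch of the record-and-replay idea, not Zorvan&#8217;s actual code: it assumes the simulation is deterministic (fixed time step, a seeded private random generator, no wall-clock reads), so replaying the logged commands reproduces the same final state. The names <code>Game</code>, <code>record</code> and <code>replay</code> are made up for illustration.</p>

```python
import random


class Game:
    """Toy deterministic simulation: state depends only on the seed and the commands."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # private RNG, so a replay sees the same numbers
        self.x = 0
        self.frame = 0

    def step(self, command):
        # Fixed-step update with no wall-clock reads, so it is repeatable.
        if command == "left":
            self.x -= 1
        elif command == "right":
            self.x += 1
        self.x += self.rng.choice([0, 0, 1])  # stand-in for "random" gameplay events
        self.frame += 1


def record(seed, commands):
    """Play a session and log every command with the frame it was issued on."""
    game = Game(seed)
    log = []
    for command in commands:
        log.append((game.frame, command))
        game.step(command)
    return log, game.x


def replay(seed, log):
    """Feed a recorded log back into a fresh game and return the final state."""
    game = Game(seed)
    for frame, command in log:
        assert frame == game.frame  # the log must line up with the simulation
        game.step(command)
    return game.x
```

<p>Used as a regression test, a recorded session is replayed after every change; a crash, a fired assertion or a different final state points at the commit that broke something.</p>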
<p>In any case, since we are almost feature-frozen now, our (programmers&#8217;) lives mostly consist of running the game till it crashes (or does something it shouldn&#8217;t do) and then tracking down the bug and working it out.</p>
<p>I guess the next step will be finding the few major performance bottlenecks and optimizing them (something we have put off till now because the optimizations would have made debugging and adding features quite hard.)</p>
<p>My point in all this was that debugging is usually considered a gruesome and intimidating task, or boring and uninteresting at best. Right now, I quite enjoy debugging our engine and game for two main reasons: first is the replay system (which makes debugging much more effective, targeted and efficient) and second and more important is finding out bugs in my own (and our own) mental processes by finding bugs in the code that resulted from those processes.</p>
<p>It can be illuminating to find out what you had missed when you designed or implemented a piece of code, or the bugs caused by lack of communication or a problem in the general work flow (these are not common, but interesting nonetheless.) This form of revelation that results from finding a bug in your code is quite a rush and can make us better programmers.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/184/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Optimize</title>
		<link>http://dev.garshasp.ir/blog/archives/159</link>
		<comments>http://dev.garshasp.ir/blog/archives/159#comments</comments>
		<pubDate>Fri, 15 Jan 2010 10:37:08 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[optimization]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=159</guid>
		<description><![CDATA[Optimization is the current focus of the technical team. There are three main areas that we are aiming to optimize in order to broaden the range of PC users who will be able to play the game at an acceptable frame rate. The first area is graphics card video memory usage, mainly related to the textures and vertex [...]]]></description>
			<content:encoded><![CDATA[<p>Optimization is the current focus of the technical team. There are three main areas that we are aiming to optimize in order to broaden the range of PC users who will be able to play the game at an acceptable frame rate. The first area is graphics card video memory usage, mainly related to the textures and vertex buffers which need to be loaded into video memory. As it appears, the first part of the game is consuming a lot of video memory due to loading the vertex buffers. PIX from DirectX has proven to be a highly valuable tool for profiling graphics card memory, although for more serious profiling we are using NVPerfHUD from Nvidia.</p>
<p>The second area is RAM usage. GlowCode is helping us find memory leaks. Our target for system memory usage is 1 GB.</p>
<p>The third area is CPU consumption and the performance of individual function calls. Intel VTune is a really useful tool in this area for measuring code run time. De-synchronizing some subsystem loops from the main graphics loop is among the main things that need to be done to free up some CPU time. An example of this de-synchronization is updating the AI loop only once every half second.</p>
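<p>One common way to decouple a subsystem from the frame rate is a time accumulator. The sketch below is an illustration only, not our engine code; <code>FixedRateTask</code> and <code>run_frames</code> are hypothetical names, and the half-second interval is the example value from above.</p>

```python
class FixedRateTask:
    """Runs a callback at a fixed interval, regardless of how often update() is called."""

    def __init__(self, interval, callback):
        self.interval = interval      # seconds between two runs of the callback
        self.callback = callback
        self.accumulated = 0.0        # time gathered since the last run

    def update(self, dt):
        # Called once per rendered frame with that frame's duration.
        self.accumulated += dt
        while self.accumulated >= self.interval:
            self.accumulated -= self.interval
            self.callback()


def run_frames(frame_count, dt, ai_interval=0.5):
    """Simulate frame_count frames of length dt and count how often each part ran."""
    counts = {"render": 0, "ai": 0}

    def ai_tick():
        counts["ai"] += 1

    ai = FixedRateTask(ai_interval, ai_tick)
    for _ in range(frame_count):
        ai.update(dt)                 # cheap on most frames: just adds dt
        counts["render"] += 1         # rendering still happens every frame
    return counts
```

<p>With 64 frames of 1/32 second each (two seconds of game time), rendering runs 64 times while the AI callback runs only 4 times.</p>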
<p>On the business side of things, negotiations with distribution channels have already started.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/159/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Workshop Schedule</title>
		<link>http://dev.garshasp.ir/blog/archives/144</link>
		<comments>http://dev.garshasp.ir/blog/archives/144#comments</comments>
		<pubDate>Sun, 27 Sep 2009 07:22:10 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[3D character]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[Concept Art]]></category>
		<category><![CDATA[Editor]]></category>
		<category><![CDATA[Engine]]></category>
		<category><![CDATA[Game Design]]></category>
		<category><![CDATA[Production]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=144</guid>
		<description><![CDATA[The schedule for the workshops held by the Garshasp development team for the third Digital Media Festival in Tehran can be found here: http://garshasp.ir/node/178]]></description>
			<content:encoded><![CDATA[<p>The schedule for the workshops held by the Garshasp development team for the third Digital Media Festival in Tehran can be found here:</p>
<p><a href="http://garshasp.ir/node/178" target="_blank">http://garshasp.ir/node/178</a></p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/144/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Sound Journeys</title>
		<link>http://dev.garshasp.ir/blog/archives/141</link>
		<comments>http://dev.garshasp.ir/blog/archives/141#comments</comments>
		<pubDate>Tue, 22 Sep 2009 21:44:45 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=141</guid>
		<description><![CDATA[One of the (one of many) &#8220;What went wrong&#8221; issues in our project has been late integration of sound into the project. Here is how an issue was triggered which has been keeping us wondering about sound for a few weeks now. The first sound files we tried to integrate for our characters have been [...]]]></description>
			<content:encoded><![CDATA[<p>One of the (one of many) &#8220;What went wrong&#8221; issues in our project has been late integration of sound into the project. Here is how an issue was triggered which has been keeping us wondering about sound for a few weeks now.</p>
<p>The first sound files we tried to integrate for our characters were stereo sounds (containing two channels). OpenAL, which is the sound library we are currently using, will not handle 3D sound management for stereo sounds, meaning there will be no attenuation of sound simulated for such audio files. This kept us wondering why no sound falloff was felt when the camera moved away from the characters. After lots of debugging and tweaking every parameter in OpenAL, we lost our trust in the internals of the library and decided to implement our own sound attenuation logic to handle the falloff in the higher game code layers.</p>
<p>Using our own attenuation handling methods meant we did not need to use the built-in attenuation techniques of OpenAL, and hence we commented out the sound attenuation technique set in the initialization part. This technique was an inverse algorithm, one of the three popular methods for calculating sound attenuation. Doing so enabled the default attenuation for OpenAL which is the linear algorithm.</p>
<p>The next round of sounds we integrated into the game were all mono channel sounds, which is the correct way to make character sounds. The mysteries started at this stage: sound was heard on some machines and nothing was heard on others, leading us to guess about anything and everything that could go wrong, from code lines to hardware specs.</p>
<p>What we did not know was that all this time the internal attenuation system of OpenAL was working quite fine and the problem with the initial sound tests had been the stereo sound files. Now the little trick was that the linear attenuation algorithms do not work well, or at all, on the XP operating system but everything is fine on Vista. No sound was heard on XP machines for this reason.</p>
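<p>For reference, here is a small sketch of the two attenuation curves involved, written from the formulas in the OpenAL 1.1 specification for its unclamped inverse and linear distance models. The function names and the default parameter values are chosen for illustration only; this is not code from our engine.</p>

```python
def inverse_distance_gain(distance, reference=1.0, rolloff=1.0):
    """Inverse distance model: gain falls off hyperbolically and never reaches zero."""
    return reference / (reference + rolloff * (distance - reference))


def linear_distance_gain(distance, reference=1.0, max_distance=100.0, rolloff=1.0):
    """Linear distance model: gain drops in a straight line towards max_distance."""
    distance = min(distance, max_distance)  # the model stops changing past max_distance
    return 1.0 - rolloff * (distance - reference) / (max_distance - reference)
```

<p>Both curves give a gain of 1.0 at the reference distance. The linear curve depends on the maximum distance as well, so a badly chosen maximum distance changes the loudness at every distance, while the inverse curve does not use it at all.</p>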
<p>Setting the attenuation technique to inverse again fixed this problem, a problem which made us traverse a full circle just to bring us back to the first step. Oh, if only we had known the little trick about stereo files&#8230;</p>
<p>Lesson #1: No stereo sound files for dynamic objects</p>
<p>Lesson #2: The linear attenuation algorithm for sound does not work on Windows XP.</p>
<p>Lesson #3: Don&#8217;t lose your trust in proven middleware too early.</p>
<p>Lesson #4: Sound should be integrated early in the project.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/141/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sound of Life</title>
		<link>http://dev.garshasp.ir/blog/archives/117</link>
		<comments>http://dev.garshasp.ir/blog/archives/117#comments</comments>
		<pubDate>Sun, 05 Jul 2009 21:22:59 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[sound]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=117</guid>
		<description><![CDATA[Sound &#8211; this integral part of every modern entertainment form. We&#8217;ve had experiments for playing sound quite a few times up to now, all being considered more functional prototypes and proof of concepts but we hadn&#8217;t solidified the design till very recently. The basic API we use for playing sound in the game is OpenAL. [...]]]></description>
			<content:encoded><![CDATA[<p>Sound &#8211; this integral part of every modern entertainment form. We&#8217;ve had experiments for playing sound quite a few times up to now, all being considered more functional prototypes and proof of concepts but we hadn&#8217;t solidified the design till very recently.</p>
<p>The basic API we use for playing sound in the game is OpenAL. A nice wrapper has been provided for it called FreeSL. OpenAL has most of the functionality we need for Garshasp and FreeSL provides useful abstractions and memory management on top of it.</p>
<p>As part of the seamlessness initiative we are following to load and unload portions of the level seamlessly, we needed to handle all the resources for the level, and one main resource which needs close attention, after graphics, is sound. We needed to be able to manage memory loading and unloading for sound files. To this end, fhm worked on integrating sound as an OGRE resource (OGRE provides great ways of extending its basic features). This task proved successful, and seamlessness now handles sound just like any other OGRE resource, using the background-thread functionality which OGRE provides along with its resource management, which is quite nice.</p>
<p>Play, stop and pause for individual character and environment sounds are handled in the scripting layer, which gives good control to the designers.</p>
<p>There are a few challenges currently facing us regarding sound. One is to find the optimal gain and attenuation for the different sounds generated by different objects in the game, which needs some tweaking. The second is to find optimal blend and fade solutions for in-game sound.</p>
<p>A rather technical challenge which we might need to address soon is streaming sound during playback instead of loading the whole data before playing, which is what we are currently doing. The second technical challenge is to play sound back from the .ogg data kept in memory; currently the compressed .ogg data gets fully decompressed first and is then played from memory.</p>
<p>Being developers in the loose, chaotic world of PC game development, I don&#8217;t know how critical the two technical issues above will really be, but I&#8217;m sure we would need to optimize every bit of sound and fix the above issues if we were dancing on top of any of the console platforms. Our new system memory monitors should come in handy in the coming days.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/117/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Turning the wheels</title>
		<link>http://dev.garshasp.ir/blog/archives/114</link>
		<comments>http://dev.garshasp.ir/blog/archives/114#comments</comments>
		<pubDate>Sun, 21 Jun 2009 21:56:20 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[Level Design]]></category>
		<category><![CDATA[scripting]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=114</guid>
		<description><![CDATA[A very serious stage has started out recently in the development process and that is putting all the elements of the levels together and preparing for play testing. We are currently using the in house level editor, Iranvij, to setup the level physical representations, graphics, interactable environment objects and enemy NPCs. Switching to &#8220;In Game&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>A very serious stage has started out recently in the development process and that is putting all the elements of the levels together and preparing for play testing. We are currently using the in house level editor, Iranvij, to setup the level physical representations, graphics, interactable environment objects and enemy NPCs. Switching to &#8220;In Game&#8221; mode in the editor is quite handy for the designers to play test right on the spot and make sure everything goes together well.</p>
<p>The in house scripting system has reached a level which has made it almost Turing Complete, since it supports sequence for commands, some kind of primitive selection constructs and a good support for repetition.</p>
<p>All enemy AI behavior, main player behaviors (Hierarchical Finite State Machines), environment interaction and main game logic are handled in this scripting layer, which provides a flexible framework for the game designers to prototype and test ideas. The whole mechanism needs some testing and tweaking before it becomes fully re-usable as a game scripting framework. Performance tuning this interface layer would be the next stage, as we usually plan for performance issues and follow the big saying of Knuth, which is: &#8220;Premature optimization is the root of all evil!&#8221;</p>
<p>On another front, we are researching good techniques for crash reporting, so that the development team can more easily track down problems encountered while the design or testing teams are working. This <a href="http://msinilo.pl/blog/?p=269" target="_blank">crash handler</a> is one we are investigating at the moment.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/114/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unroll Necessary</title>
		<link>http://dev.garshasp.ir/blog/archives/90</link>
		<comments>http://dev.garshasp.ir/blog/archives/90#comments</comments>
		<pubDate>Sat, 30 May 2009 23:48:03 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[Physx]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=90</guid>
		<description><![CDATA[The physical representation is a parallel universe which exists for the game objects in our game engine. This representation sometimes resembles the graphical representation universe and is sometimes quite different, depending on the specific game objects. The PhysX engine now owned by Nvidia is what we use to model this physical world and simulate the [...]]]></description>
			<content:encoded><![CDATA[<p>The physical representation is a parallel universe which exists for the game objects in our game engine. This representation sometimes resembles the graphical representation universe and is sometimes quite different, depending on the specific game objects. The PhysX engine now owned by Nvidia is what we use to model this physical world and simulate the behaviors of its elements, the physical objects.</p>
<p>We use physics mostly for collision detection, rigid body dynamics, the character controller and cloth simulation. Ragdolls have been on the wish-list for quite a while and we’d have to see if they can be included in the time available; pre-production testing was OK, but real production testing hasn’t been done yet.</p>
<p>There are two main types of physical objects: those which are simulated and update their graphical representations based on their coordinates, and those which get updated from the graphical representations. An example of the first group is a falling box: its physical representation is simulated and, based on its position in the world, a graphical representation is rendered every frame to indicate where it should be in space at that time. An example of the second group is the physical representation of the weapon of a character, which is used for collision detection. The location of this object is taken from the location of a character bone moved by its prebuilt animation, so here a physical object is updated from a graphical object.</p>
<p>Seems quite straightforward, but here is where the fun begins. In the main game loop, input is processed first, followed by the game object logic, which includes animation updates, followed by the physics update and finally the render. Everything works great for the second group mentioned above, since the graphics gets updated (e.g. animation bones) and then, in the physics update, the physical representations are moved to the correct positions. For a falling box, however, the positions for the frame are computed in the physics update, which happens after the game object updates, meaning the graphical representation cannot adjust itself to physics in the same frame and graphics falls one frame behind. This problem is negligible in most cases, since when all graphics is one frame behind, it is hard to notice the one-frame (about 0.03 seconds or less) difference between physics and graphics.</p>
<p>The above “escape strategy” only lasts until you try to increase the level of interaction in the scene, for example by lifting a character on an animated platform. Let’s trace the main loop. Game objects get updated: the platform is animated, so its new position is determined. A physical representation is attached to this platform, so in the same update the physics object is told to follow the animation, but its actual position in the physics world will not change until the next physics update, which is due very shortly. The physics update executes: the new position of the platform is calculated, and the character controllers are processed to find the new positions of the dynamic characters. Since our character is on the platform, its character controller resolves to somewhere on top of the platform. The physics update ends and rendering starts. The platform is rendered in the correct location, but our character, whose character controller does have the correct position, is really one frame behind, since its graphics were synchronized with the physical representation during the main game object update loop, which happened before the physics update. This causes our character to sink into the rising platform and lag one frame behind in the graphics world. The problem can be modeled with functions: in the above scenario, a physical object is a function of a graphical object, and another graphical object is a function of a physical object, i.e. physics = f(graphics) and graphics = g(physics). We are trying to solve these equations at the same time in one iteration, which leads to problems.<br />
A workaround for this issue is to have a post-physics update loop after the physics simulation, to re-adjust the objects which were affected by the changes in physical objects. This could be the last step in the main game loop, right before rendering. This way it would be possible for our character to look synchronized with everything else.</p>
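<p>The frame ordering above can be reduced to a toy model. This is a sketch under heavy simplification and not our engine or PhysX code: positions are a single height value, the physics step just puts the character on top of the platform, and the names <code>Platform</code>, <code>Character</code> and <code>frame</code> are made up for the example.</p>

```python
class Platform:
    """Kinematic object: animation drives graphics, and graphics drives physics."""

    def __init__(self):
        self.gfx_y = 0.0
        self.phys_y = 0.0


class Character:
    """Dynamic object: the character controller (physics) drives graphics."""

    def __init__(self):
        self.gfx_y = 0.0
        self.phys_y = 0.0


def frame(platform, character, rise, post_physics_sync):
    """Run one frame and return the heights that would be rendered."""
    # 1. Game object update: animate the platform, request its new kinematic
    #    position, and sync character graphics from the last physics result.
    platform.gfx_y += rise
    kinematic_target = platform.gfx_y
    character.gfx_y = character.phys_y

    # 2. Physics update: the kinematic target is applied, then the character
    #    controller resolves to the top of the platform.
    platform.phys_y = kinematic_target
    character.phys_y = platform.phys_y

    # 3. Optional post-physics pass: re-sync graphics that depend on physics.
    if post_physics_sync:
        character.gfx_y = character.phys_y

    # 4. Render.
    return platform.gfx_y, character.gfx_y
```

<p>Without the post-physics pass, a platform rising by 1.0 per frame is drawn at 1.0 while the character is still drawn at 0.0, which is the one-frame sinking effect. With the pass enabled, both are drawn at 1.0.</p>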
<p>Again, the above is only valid if the main character does not have any physical objects attached to it. If there are physical objects attached to the main player, such as weapons or ragdoll capsules, re-adjusting the character’s graphics location after the physics update can compute and set new positions for these physical objects, but those positions will not really be applied to the physical world until the next physics simulation update, which happens in the next frame. Having another physics simulation update in the same frame would complicate things, cause a few unwanted issues, and make the same loop go on forever.</p>
<p>One solution to this final problem would be to divide the physics simulation update into two parts: one in which the actual dynamic objects get simulated, and another which sets the exact kinematic object positions; the second would not need any notion of time or time-steps. This is of course something we cannot do, since we do not own the source code for PhysX, so we’ll have to keep dealing with it as a black box for the time being and look for workarounds.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/90/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parallel tasks</title>
		<link>http://dev.garshasp.ir/blog/archives/82</link>
		<comments>http://dev.garshasp.ir/blog/archives/82#comments</comments>
		<pubDate>Fri, 22 May 2009 15:25:00 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[state machines]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=82</guid>
		<description><![CDATA[One feature which has been implemented in the Hierarchical State Machine we are using is the ability to support parallel states. Parallel states are usually drawn on a standard state chart diagram such as UML state charts inside one state, separated by dashed lines. A fork would be necessary to divide the incoming transition into [...]]]></description>
			<content:encoded><![CDATA[<p>One feature which has been implemented in the Hierarchical State Machine we are using is the ability to support parallel states. Parallel states are usually drawn on a standard state chart diagram such as UML state charts inside one state, separated by dashed lines. A fork would be necessary to divide the incoming transition into two parallel tasks. Leaving any of these states would cause a transition from the parallel state.</p>
<p>An example of using parallel states in the AI of the main character is a jump state where it is possible to move forward during the jump if the player presses the navigation keys. In this case, the jump itself is composed of a few sub-states and a parallel state. Moving and not moving are the sub-states of one of the parallel states. Pressing any navigation key causes a transition to the moving state; not pressing one causes a transition to the stationary state. These changes of state happen in parallel with the rest of the states related to the jumping action.</p>
<p>This way it would be possible to add more dimensions to the behaviors of the PCs or NPCs.</p>
<p>Since we currently have a symbol definition system for storing and evaluating variable values in the scripting system, it would also be possible to simulate the join feature after a parallel state. A join means that the process does not continue until both of the parallel sub-states have made the exit transition. Each of the two parallel states sets a symbol upon exit; the join state checks for the two symbols and, if they are both set, makes the exit transition. We haven&#8217;t used this feature yet.</p>
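<p>A minimal sketch of such a symbol-based join, for illustration only. It is written in Python rather than our scripting system, and the class name <code>JoinState</code> and the symbol names are hypothetical.</p>

```python
class JoinState:
    """Waits until every required symbol has been set by an exiting parallel state."""

    def __init__(self, required_symbols):
        self.required = set(required_symbols)
        self.symbols = set()          # symbols that have been set so far

    def set_symbol(self, name):
        # Called by a parallel sub-state when it makes its exit transition.
        self.symbols.add(name)

    def can_exit(self):
        # The join may fire only when all parallel branches have finished.
        return self.required.issubset(self.symbols)
```

<p>For the jump example, the join would wait for one symbol from the jumping branch and one from the moving branch, and make its exit transition only once both are set.</p>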
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/82/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Resource Loop</title>
		<link>http://dev.garshasp.ir/blog/archives/80</link>
		<comments>http://dev.garshasp.ir/blog/archives/80#comments</comments>
		<pubDate>Sat, 16 May 2009 20:23:20 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Editor]]></category>
		<category><![CDATA[Engine]]></category>
		<category><![CDATA[Iranvij]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[Seamlessness]]></category>
		<category><![CDATA[Zorvan]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=80</guid>
		<description><![CDATA[One major engine task currently being worked on is resource management, as part of the Seamlessness feature which is being added to the game engine. Resources are 3D mesh files, textures, sound files, animation files, etc. The seamlessness design breaks down [...]]]></description>
			<content:encoded><![CDATA[<p>One major engine task currently being worked on is resource management, as part of the Seamlessness feature which is being added to the game engine. Resources are 3D mesh files, textures, sound files, animation files, etc. The seamlessness design breaks the world down into chunks which are loaded and unloaded in real time as the player progresses through the levels. This should ideally help improve rendering performance and reduce memory consumption and the general CPU load due to processes related to AI, physics and animation.</p>
<p>This feature needs to connect to the resource management hooks and start getting tested. It will probably be the last major architecturally significant feature we&#8217;ve planned for at this stage of development. The most critical resources are related to graphics and we&#8217;re currently trying to investigate the mechanisms OGRE provides for handling resources. One major feature which is definitely necessary for this design is the background resource load/unload which OGRE supports in a separate application thread.</p>
<p>The necessary tools for Seamlessness and Resource Management are being built into the level editor (Iranvij) in parallel.</p>
<p>Implementing this feature properly needs great focus from the technical and artistic departments since it affects the processes and main workflows of both.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/80/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Time, Only Time</title>
		<link>http://dev.garshasp.ir/blog/archives/63</link>
		<comments>http://dev.garshasp.ir/blog/archives/63#comments</comments>
		<pubDate>Sat, 09 May 2009 00:25:52 +0000</pubDate>
		<dc:creator>Yaser Zhian</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[implementation]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=63</guid>
		<description><![CDATA[Keeping track of time is a sensitive problem. Designing a good system for this is in many ways one of the most basic and most crucial tasks a game engine developer has. One of those reasons is that most (if not all) game engines are simulation engines at their core, and without proper treatment and [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: justify">
<p>Keeping track of time is a sensitive problem. Designing a good system for this is in many ways one of the most basic and most crucial tasks a game engine developer has. One of those reasons is that most (if not all) game engines are simulation engines at their core, and without proper treatment and handling of time, nothing much else can be handled and treated.</p>
<p>Anyways, I&#8217;m not going to talk about the design of the time system that we have implemented in Zorvan (that&#8217;s what we call our game engine) but about its implementation, and not about the whole implementation, but rather about how we actually read time from the system, and the pitfalls and problems associated with it.</p>
<p>Basically, when writing your programs in C, and on Windows on a PC, you have a range of options for reading the absolute time. The basis of this absolute time is not important because we&#8217;ll be only working with time deltas and almost never with the value of the time itself. Let&#8217;s call each source of such an absolute time a &#8220;time source&#8221;(!)</p>
<p>There are a few parameters that we should be concerned about in a time source. The first is precision: the frequency of the source, i.e. the smallest time value that it can actually measure (or the number of meaningful digits in its return value). The next one is update interval. For example, a time source may advertise that it can measure time in micro- or nano-seconds, but its value may only change (or be updated) once a millisecond. The third parameter is the overhead of actually reading time from that source. If a source writes data to a disk file only and you have to read it from there, it&#8217;s gonna be no good for you whether it measures time in femto-seconds or not. You are going to spend a few milliseconds reading that number anyway (if not more) and it won&#8217;t be any good then.</p>
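<p>As a rough illustration of how the last two parameters can be measured, here is a minimal sketch. The names <tt>TickFn</tt>, <tt>steady_ns</tt>, <tt>min_jump</tt> and <tt>call_overhead_ns</tt> are hypothetical, and <tt>std::chrono::steady_clock</tt> only stands in for whichever time source is actually under test.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical signature for "a time source": returns its current tick count.
using TickFn = std::uint64_t (*)();

// Example source (an assumption for this sketch): steady_clock in nanoseconds.
std::uint64_t steady_ns() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Smallest forward jump seen between two successive distinct readings.
// This approximates the update interval of the source, in its own ticks.
std::uint64_t min_jump(TickFn read, int samples) {
    assert(samples > 0);
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int i = 0; i < samples; ++i) {
        const std::uint64_t a = read();
        std::uint64_t b = read();
        while (b == a) b = read();               // spin until the value changes
        if (b > a && b - a < best) best = b - a; // ignore backward jumps
    }
    return best;
}

// Average cost of one read, in nanoseconds, timed with steady_clock.
double call_overhead_ns(TickFn read, int calls) {
    assert(calls > 0);
    volatile std::uint64_t sink = 0;             // keeps the calls from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i) sink = read();
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(stop - start).count() / calls;
}
```

<p>Calling <tt>min_jump(steady_ns, 1000)</tt> and <tt>call_overhead_ns(steady_ns, 1000000)</tt> gives the two figures for this particular source; a source whose value never changes would make <tt>min_jump</tt> spin forever, so this is only suitable for quick experiments.</p>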
<p>The two last parameters can be combined, but I thought I&#8217;d make a distinction because they are indeed different to a programmer and they present different symptoms in the application.</p>
<p>OK, enough with the pep talk. Let&#8217;s get down to business. The most obvious time sources are the CRT <tt>clock()</tt>, the Win32 API&#8217;s <tt>GetTickCount()</tt> and Win32 Multimedia API&#8217;s <tt>timeGetTime()</tt>. These are all millisecond sources. That is, they all make you think they have a millisecond precision. On Windows, the first two actually have an awful update interval of 15-16 milliseconds (any old DOS programmer should find this number painfully nostalgic!) Of course, this number is not fixed and you should not assume it, but I don&#8217;t remember if I&#8217;ve encountered any different behaviour by these two calls in recent years. The multimedia timer actually does have a millisecond precision and it does indeed get updated every actual millisecond, but the API documentation asserts that you cannot count on that either (but you can ask the OS to make an effort to update this timer value in any multiples of milliseconds that you want.) Among these three, the multimedia timer has the lowest call overhead and is obviously the best of them.</p>
<p>In any case, the millisecond precision is not enough for most of the timing needs of a game engine, but it <em>may</em> be enough for the most important usage of measuring the frame time. For games that have a framerate of below 100, this should work out well enough for now, but you should think about the future too. In general, don&#8217;t use this to measure any duration less than a second in a game.</p>
<p>The next time source on Windows, with more precision and supposedly better behavior, is the high performance timer (or whatever) aka <tt>QueryPerformanceCounter()</tt>. This allegedly uses one of the hardware counters on your system and produces close to microsecond (or a little better) precision. However, the frequency of this time source is not fixed from system to system and you have to query it from the OS (using <tt>QueryPerformanceFrequency()</tt>) at runtime, but it is guaranteed not to change while the computer is up and running (I don&#8217;t know what happens across system hibernations, and frankly, I don&#8217;t want to find out!) On my system, and a few others I&#8217;ve tested, the frequency of this source is 3&#8217;579&#8217;545 ticks per second, which gives it a precision of about 280 nanoseconds. As my tests show, though, each invocation of this function takes about 2 microseconds (on my test system), which makes the overhead and latency rather high compared to its precision. Still, for timing frames and other non-critical code (out of your precious inner loops) this is probably your best bet.</p>
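<p>The arithmetic behind those numbers is simple enough to sketch. The helpers below are hypothetical and deliberately platform-neutral: on Windows the readings would come from <tt>QueryPerformanceCounter()</tt> and the frequency from <tt>QueryPerformanceFrequency()</tt>, but here they are plain parameters.</p>

```cpp
#include <cassert>
#include <cstdint>

// Precision of a counter, in seconds per tick, given its frequency in ticks per second.
double seconds_per_tick(std::int64_t frequency) {
    assert(frequency > 0);
    return 1.0 / static_cast<double>(frequency);
}

// Converts the difference between two counter readings into seconds.
double ticks_to_seconds(std::int64_t start, std::int64_t end, std::int64_t frequency) {
    assert(frequency > 0);
    return static_cast<double>(end - start) / static_cast<double>(frequency);
}
```

<p>With the frequency quoted above, <tt>seconds_per_tick(3579545)</tt> comes out at roughly 279.4 nanoseconds, which is where the &#8220;about 280&#8221; figure comes from.</p>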
<p>One largely curious (not to mention disturbing) behavior we observed recently in Zorvan while using this method for time calculations was that it returned the same number in two invocations a few milliseconds apart which resulted in all sorts of erratic behaviors, but I haven&#8217;t had time to investigate it in depth and I haven&#8217;t been able to reproduce it again.</p>
<p>Perhaps the most pervasive method for obtaining time in game engines is using the <tt>rdtsc</tt> (Read Time-Stamp Counter) x86 instruction, which has been available since the original Pentium CPUs. It has no parameters and returns the number of CPU cycles passed since the CPU was restarted as a 64-bit number in EDX:EAX. The instruction is lightweight and low-overhead, has a very high precision (almost the highest precision possible, because any lapse of time smaller than the CPU clock cycle is hardly meaningful or measurable in general computing) and is available everywhere (on all PCs anyway, and it&#8217;s obviously not Windows-specific.)</p>
<p>For those who are afraid of inline assembly, there is even a convenient intrinsic available in Visual C++ (include &#8220;intrin.h&#8221; and call <tt>__rdtsc()</tt>.) (The GCC inline assembly call is left as an exercise for the reader!)</p>
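<p>One possible answer to that exercise, as a hedged sketch: the wrapper names <tt>read_tsc</tt> and <tt>cycles_to_seconds</tt> are hypothetical, and the last branch (a portable clock on non-x86 targets) is only an assumption added so that the snippet compiles everywhere.</p>

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
// Visual C++: use the compiler intrinsic.
inline std::uint64_t read_tsc() { return __rdtsc(); }
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
// GCC/Clang on x86: rdtsc leaves the low 32 bits in EAX and the high 32 bits in EDX.
inline std::uint64_t read_tsc() {
    std::uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#else
#include <chrono>
// Not an x86 target: fall back to a portable clock (assumption for this sketch only).
inline std::uint64_t read_tsc() {
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}
#endif

// Converts a cycle count to seconds, given a CPU clock rate measured elsewhere.
inline double cycles_to_seconds(std::uint64_t cycles, double cpu_hz) {
    assert(cpu_hz > 0.0);
    return static_cast<double>(cycles) / cpu_hz;
}
```

<p>The clock rate passed to <tt>cycles_to_seconds</tt> has to be measured by the application itself (for example against one of the slower time sources above), since <tt>rdtsc</tt> only counts cycles.</p>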
<p>But nothing is free. There are a few problems attached to the use of <tt>rdtsc</tt> which fall mostly in two categories: problems caused by multi-CPU systems and those that result from its interaction with CPU power saving schemes.</p>
<p>In multi-CPU systems (including multi-cores,) the different CPUs may not have started counting cycles at exactly the same time, which causes the results read from different cores to be slightly different. This can happen very easily because normally your code can run on different CPUs at different times (across task switches) and can cause the time seen by the application to appear to go backwards (imagine trying to feed a negative time delta to your physics solver!) The first time I saw this was on an AMD Athlon 64 X2. Although the problem is solvable with an official patch from AMD, it has left me always afraid of having to encounter it again! In any case, I haven&#8217;t seen this problem on Intel Core architecture CPUs and I haven&#8217;t had access to an AMD Phenom or Opteron to test.</p>
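<p>A defensive measure on the application side is to never let a backwards reading turn into a negative delta. The class below is a hypothetical sketch of that idea, not a description of what Zorvan does: it simply reports zero elapsed ticks until the counter catches up with the highest value seen so far.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper that turns raw counter readings into non-negative deltas.
class MonotonicDelta {
public:
    explicit MonotonicDelta(std::uint64_t first_reading) : last_(first_reading) {}

    // Returns the ticks elapsed since the last accepted reading. A reading that
    // appears to go backwards (e.g. after migrating to another core) yields 0.
    std::uint64_t advance(std::uint64_t reading) {
        if (reading <= last_) return 0;   // never report negative time
        const std::uint64_t delta = reading - last_;
        assert(delta > 0);                // only strictly forward progress gets here
        last_ = reading;
        return delta;
    }

private:
    std::uint64_t last_;                  // highest reading accepted so far
};
```

<p>Keeping the highest reading (rather than resetting to the lower one) means a little time is lost after a backwards jump instead of being counted twice; which of the two is preferable depends on the consumer of the delta.</p>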
<p>The second group of problems happens because modern CPUs may vary and adapt their clock rates at runtime to different loads (this is quite common on mobile-class CPUs especially.) This means that when you measure your CPU clock rate (e.g. at the start of your application) it may not stay the same during the lifetime of your application and may go down or up (if the initial load was low.) In either case, it will wreak havoc on your time calculations.</p>
<p>But don&#8217;t despair! While solving the first issue is rather ugly in applications (you have to bind your time-reading thread to a single CPU (it&#8217;s called setting the &#8220;CPU affinity&#8221; for that thread; STFW yourselves)) and solving the second problem is impractical in application code, the second category of problems can be solved rather easily (IMHO) in the CPU itself. It just has to always report the time-stamp counter value according to the highest clock rate. And I suspect that CPUs actually do this, because I haven&#8217;t encountered problems of the second category yet and it&#8217;s only a theoretical problem for me for the time being. I may just be lucky, but I don&#8217;t believe so!</p>
<p>Anyway, for your reference and purposes of comparison, I have measured some relevant timing values for the 5 time sources discussed above on my laptop (a T7200 CPU, i.e. Intel Core 2 Duo, 2GHz), which are presented in the table below:</p>
<table border="1">
<tr>
<th>Time Source</th>
<th>Call Overhead (microseconds)</th>
<th>Minimum Value Jump Between Two Successive Calls (ticks)</th>
<th>Frequency (Hz)</th>
<th>Precision (microseconds per tick)</th>
</tr>
<tr>
<td>clock()</td>
<td>0.03822</td>
<td>15</td>
<td>1000</td>
<td>1000</td>
</tr>
<tr>
<td>GetTickCount()</td>
<td>0.00313</td>
<td>15</td>
<td>1000</td>
<td>1000</td>
</tr>
<tr>
<td>timeGetTime()</td>
<td>0.02026</td>
<td>1</td>
<td>1000</td>
<td>1000</td>
</tr>
<tr>
<td>QueryPerformanceCounter()</td>
<td>1.921</td>
<td>5</td>
<td>3579545</td>
<td>0.279365</td>
</tr>
<tr>
<td>rdtsc</td>
<td>0.000516</td>
<td>60</td>
<td>2000000000</td>
<td>0.0005</td>
</tr>
</table>
<p><small>*: Each number is measured and averaged over about 100 million iterations.</small><br />
There are of course a couple more methods of reading time from other hardware sources, but their availability and parameters are rather system dependent and I won&#8217;t go into the topic any further.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/63/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Brains</title>
		<link>http://dev.garshasp.ir/blog/archives/65</link>
		<comments>http://dev.garshasp.ir/blog/archives/65#comments</comments>
		<pubDate>Fri, 08 May 2009 22:07:03 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[state machines]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=65</guid>
		<description><![CDATA[For the main decision making process of the AI in the game, we use the good old state machines. Very recently we tried to make these state machines data driven so that it could be tweaked by the designers easily and provide us with much more flexibility. The grand design was borrowed from an interesting [...]]]></description>
			<content:encoded><![CDATA[<p>For the main decision making process of the AI in the game, we use the good old state machines. Very recently we tried to make these state machines data driven so that they could be tweaked by the designers easily and provide us with much more flexibility.</p>
<p>The grand design was borrowed from an interesting article in <a href="http://www.amazon.com/AI-Game-Programming-Wisdom-CD/dp/1584505230" target="_blank">AI Game Programming Wisdom 4</a> about hierarchical dynamic state charts. It has now been implemented in the engine and integrated with the in-house scripting system at the top layer and the core Game Object component system at the bottom layer. All NPCs now benefit from the data driven state machines, which use a basic XML file for their configuration. This XML definition is the <a href="http://www.w3.org/TR/2005/WD-scxml-20050705/" target="_blank">standard SCXML</a> used for defining state charts, like the UML state charts. A graphical tool has been implemented in Iranvij (the world editor) as part of the Behavior Tools we had planned to build. The state chart nodes can be created and connected using this graphical tool, and once the architecture for the state chart is set, events can be set on the transitions and scripts added to the state entry/exit functions.</p>
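<p>To give a feel for what &#8220;data driven&#8221; means here, the following is a deliberately simplified sketch: a flat (non-hierarchical) state machine whose transitions live in a table that could be filled from a data file. The class, the state names and the event names are all hypothetical; the actual engine code, the SCXML loading and the entry/exit scripts are not shown.</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical flat state machine driven entirely by a transition table.
class StateMachine {
public:
    explicit StateMachine(std::string initial) : current_(std::move(initial)) {}

    // Registers "in state `from`, event `event` leads to state `to`".
    // In a data driven setup these calls would be generated from the data file.
    void add_transition(const std::string& from, const std::string& event,
                        const std::string& to) {
        assert(!from.empty() && !event.empty() && !to.empty());
        table_[std::make_pair(from, event)] = to;
    }

    // Fires an event; returns true if a transition was taken.
    bool fire(const std::string& event) {
        auto it = table_.find(std::make_pair(current_, event));
        if (it == table_.end()) return false;   // event is ignored in this state
        current_ = it->second;
        return true;
    }

    const std::string& current() const { return current_; }

private:
    std::string current_;
    std::map<std::pair<std::string, std::string>, std::string> table_;
};
```

<p>For example, after registering <tt>idle</tt> + <tt>enemy_seen</tt> &rarr; <tt>chase</tt>, firing <tt>enemy_seen</tt> on an idle NPC moves it to <tt>chase</tt>, while any event not listed for the current state is simply ignored. Because the behavior lives in the table rather than in code, a designer can change it without touching the engine.</p>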
<p>The functionality seems to work fine currently. One concern, as usual, is performance, since this decoupled, flexible design is not necessarily in line with improving performance (is there any design goal for software in line with improving performance at all?). To find out, we would have to profile a busy scene.</p>
<p>This new design transfers a lot of game logic outside the main engine code and into the scripting system. The scripting system we currently use requires simple commands to be developed in the engine and called from the script. We have to see how far we can go with this design; eventually we will either extend our scripting features or switch to a standard scripting language such as Lua or AngelScript and integrate that with the engine code.</p>
<p>The only dude in the game whose AI is not yet data driven is the infamous Garshasp himself. Once we’re sure everything is going well with the new design, Garshasp will delegate his behavior definitions to a few data files outside the executable binary.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/65/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Curse of the Polygon</title>
		<link>http://dev.garshasp.ir/blog/archives/44</link>
		<comments>http://dev.garshasp.ir/blog/archives/44#comments</comments>
		<pubDate>Sun, 03 May 2009 03:47:46 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[hardware skinning]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=44</guid>
		<description><![CDATA[The number of polygons in a 3D model was a main concern for us when we started off the development for Garshasp. After a little experience and testing with our rendering engine, OGRE, it was apparent that the number of polygons wasn’t as critical as we thought and other factors such as rendering batch counts [...]]]></description>
			<content:encoded><![CDATA[
<p class="MsoNormal">The number of polygons in a 3D model was a main concern for us when we started off the development for Garshasp. After a little experience and testing with our rendering engine, OGRE, it was apparent that the number of polygons wasn’t as critical as we thought and that other factors, such as rendering batch counts, had more effect on the frame rate. In the second wave of our development, we relaxed the polygon counts a little bit and aimed at higher polygon art.</p>
<p class="MsoNormal">This trend continued until very recently when <a href="http://yaserzt.com/blog" target="_blank">yzt</a>, our Director of Tools and Technology, profiled the system using Intel VTune and surprisingly found out that currently one main bottleneck for system performance is the skinning calculations done for animating the 3D characters.</p>
<p class="MsoNormal">We are currently using software skinning, which means all the calculations happen on the CPU. Every polygon added to a model means added vertices, and each vertex needs its position recalculated every frame from the bone positions of the character skeleton. More polygons therefore mean more calculations per frame.</p>
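<p class="MsoNormal">The per-vertex work can be sketched as follows. This is generic linear blend skinning, assumed here as the usual form of the calculation rather than taken from OGRE&#8217;s source; the types <tt>Vec3</tt>, <tt>Mat3x4</tt> and <tt>BoneWeight</tt> and the function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// 3x4 affine bone matrix: rotation/scale in the left 3x3, translation in the last column.
struct Mat3x4 { float m[3][4]; };

// One bone influence on a vertex: which bone, and how strongly.
struct BoneWeight { std::size_t bone; float weight; };

// Builds a pure translation matrix (handy for small experiments).
Mat3x4 translation(float tx, float ty, float tz) {
    Mat3x4 r = {{{1.0f, 0.0f, 0.0f, tx}, {0.0f, 1.0f, 0.0f, ty}, {0.0f, 0.0f, 1.0f, tz}}};
    return r;
}

// Applies one bone matrix to a position.
Vec3 transform(const Mat3x4& b, const Vec3& v) {
    return Vec3{
        b.m[0][0] * v.x + b.m[0][1] * v.y + b.m[0][2] * v.z + b.m[0][3],
        b.m[1][0] * v.x + b.m[1][1] * v.y + b.m[1][2] * v.z + b.m[1][3],
        b.m[2][0] * v.x + b.m[2][1] * v.y + b.m[2][2] * v.z + b.m[2][3]};
}

// Skinned position = weighted sum of the vertex transformed by each influencing bone.
// This runs once per vertex per frame, which is why vertex count drives the cost.
Vec3 skin_vertex(const Vec3& v, const std::vector<BoneWeight>& influences,
                 const std::vector<Mat3x4>& bones) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (const BoneWeight& w : influences) {
        assert(w.bone < bones.size());
        const Vec3 p = transform(bones[w.bone], v);
        out.x += w.weight * p.x;
        out.y += w.weight * p.y;
        out.z += w.weight * p.z;
    }
    return out;
}
```

<p class="MsoNormal">A vertex at (1, 1, 1) influenced equally by an unmoved bone and a bone translated by 2 units along X ends up halfway between the two results, at (2, 1, 1). The weights are assumed to sum to one; the sketch does not normalize them.</p>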
<p class="MsoNormal">This bottleneck would be another reason to select lower polygon models, but before doing so we tried out hardware skinning, in which the calculations for vertex positions are handled in a vertex shader. Embracing the massively parallel architecture of the GPU, these calculations can be handled very efficiently, and the initial tests by yzt have been successful so far. We need to delve into it a bit more, and our next challenge will be to limit the number of bones in character skeletons to comply with vertex shader limitations.</p>
<p class="MsoNormal">We’ll re-enter the loop to attack the next performance eater after this issue is solved.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/44/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Behavior Tool &#8211; Seamlessness</title>
		<link>http://dev.garshasp.ir/blog/archives/13</link>
		<comments>http://dev.garshasp.ir/blog/archives/13#comments</comments>
		<pubDate>Sun, 26 Apr 2009 20:34:11 +0000</pubDate>
		<dc:creator>fassihi</dc:creator>
				<category><![CDATA[Editor]]></category>
		<category><![CDATA[Engine]]></category>
		<category><![CDATA[code]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=13</guid>
		<description><![CDATA[We have felt the need for a specific tool in our Editor for so long now. Something to enable us to focus on a specific character and tweak its properties, whether behaviors, animation, physics or any other property. We never quite dedicated the time to implement the features we needed. However some bursts of [...]]]></description>
			<content:encoded><![CDATA[<p>We have felt the need for a specific tool in our Editor for so long now. Something to enable us to focus on a specific character and tweak its properties, whether behaviors, animation, physics or any other property. We never quite dedicated the time to implement the features we needed. However, some bursts of inspiration struck me recently when I went over the Havok Behavior Tool, which is a very nice and neat product.</p>
<p>Our plan for this week is to implement a few features in our editor, Iranvij, which will enable us to view a character in an isolated window, run the different animations with a time tracker and simulate this character alone to view the different behaviors from different states. This will be a big addition to Iranvij. The state chart for every character can be edited visually thanks to the new data driven Hierarchical Finite State Machines used for the characters and the visual graph features in our editor.</p>
<p>Another major feature which is being added to the game engine is the Seamlessness feature which will enable huge game worlds to be loaded and unloaded on the fly without any load screens halting the game experience. The main parts have already been implemented in the game engine, Zorvan, and now the necessary tools have to be added to Iranvij. This is going to be a big addition to the engine capabilities.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/13/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Render Pipeline Rewrite</title>
		<link>http://dev.garshasp.ir/blog/archives/10</link>
		<comments>http://dev.garshasp.ir/blog/archives/10#comments</comments>
		<pubDate>Sun, 26 Apr 2009 13:06:20 +0000</pubDate>
		<dc:creator>Yaser Zhian</dc:creator>
				<category><![CDATA[Engine]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[graphics]]></category>

		<guid isPermaLink="false">http://dev.garshasp.ir/blog/?p=10</guid>
		<description><![CDATA[We&#8217;re in the process of redesigning and rewriting the CPU and GPU-side code for Garshasp&#8217;s rendering pipeline. What I&#8217;m thinking about (and what we more-or-less have already) is an HDR, per-pixel-lit rendering with (pretty slow) shadow maps. The new pipeline that I&#8217;m thinking about is like this: Shadow Pass (once per shadow-casting light that affects [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re in the process of redesigning and rewriting the CPU and GPU-side code for Garshasp&#8217;s rendering pipeline. What I&#8217;m thinking about (and what we more-or-less have already) is an HDR, per-pixel-lit rendering with (pretty slow) shadow maps.<br />
The new pipeline that I&#8217;m thinking about is like this:</p>
<ol>
<li>Shadow Pass (once per shadow-casting light that affects the frustum):<br />
Render all shadow-casting geometry from the light view point and keep all the depth values in the render target.<br />
Hardware requirements: FP32 or (at least) FP16 texture support. I can pack the depth value in an A8R8G8B8 texture if this turns out to be a limiting requirement.<br />
Notes: OGRE handles this by default, and it probably does a better job than me (I never got the hang of all the different methods for PSM frustum calculations. <img src='http://dev.garshasp.ir/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  )
</li>
<li>Depth pass:<br />
Do an initial rendering of all opaque geometry to initialize the Z-buffer and also write the view-space pixel depths to a floating-point texture (because we SM3.0-era PC-developers can&#8217;t read from the Z-buffer directly.)<br />
Hardware requirements: FP32 or (at least) FP16 texture support. However, I can pack the depth value in an A8R8G8B8 texture if this turns out to be a limiting requirement.<br />
Notes: I should investigate interactions of this with MSAA.
</li>
<li>Shadow Map Generation Pass:<br />
Render all shadow-receiving geometry and calculate whether each pixel is in shadow or not (and how much) using the information from the last two passes. The PCF and gang should be run here. If we decide to allow multiple shadow casting lights, we can do this calculation for four of them in this pass and write the result in different components of a single A8R8G8B8 render target. More will need MRT.<br />
Hardware Requirements: Nothing special.<br />
Notes: OGRE claims that it handles this by default. But it also provides an &#8220;Integrated Shadow&#8221; option which should give me more control (and more chances to mess things up!) I should think about whether I can integrate this pass with the last one. The only problems I see are the MRT prerequisite and the different render target bit-depth requirements (8 vs. 32.)
</li>
<li>Render Pass:<br />
Render the glorious scene including the translucent objects and objects that need special treatment (fog volumes, light volumes, water,) using the shadow map generated in the last pass and the depth values from the pass before that (for the special effects.)<br />
Hardware Requirements: Will probably need SM3.0 or SM2.x if we want to support more than one or two lights. This is preferred to turning this into multiple passes, because of the number of our triangles and the large number of animations (rather costly vertex programs.) Also, will require FP_ARGB_16 for the render target; I&#8217;m hoping that every SM2.x card supports this with ease.<br />
Notes: </li>
<li>HDR Bloom:<br />
Very effective visually, although requires many, many passes (the number is also partly dependent on output resolution, because of the down-sampling.)<br />
Hardware Requirements: SM3.0 allows a dramatic reduction in the number of required passes (from 10-15 to 4-5.) But I don&#8217;t think I have time to write two sets of shaders. Will have to see what our recommended requirements would be.<br />
Note: Other effects can be achieved while we are at it here (e.g. glow maps.)
</li>
<li>Tonemapping:<br />
In effect, this is integrated into the last pass, but it&#8217;s too important not to get its own pass!<br />
Hardware Requirements: Nothing special that I can see.<br />
Notes: I need to read more on this. The few methods I have tried give great results but only in certain situations.
</li>
</ol>
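<p>The &#8220;pack the depth value in an A8R8G8B8 texture&#8221; fallback mentioned in the first two passes can be illustrated on the CPU side. This is only a sketch under assumptions: the depth is taken to be normalized to [0, 1), it is stored as 32-bit fixed point with the most significant byte in the red channel, and the names <tt>Rgba8</tt>, <tt>pack_depth</tt> and <tt>unpack_depth</tt> are hypothetical. A real shader would do the equivalent with multiplications and <tt>frac()</tt>, and the layout actually used in the pipeline may differ.</p>

```cpp
#include <cassert>
#include <cstdint>

// One texel of an 8-bit-per-channel render target.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Packs a depth in [0, 1) as 32-bit fixed point, most significant byte in r.
Rgba8 pack_depth(double depth) {
    assert(depth >= 0.0 && depth < 1.0);
    const std::uint32_t fixed = static_cast<std::uint32_t>(depth * 4294967296.0); // depth * 2^32
    return Rgba8{static_cast<std::uint8_t>(fixed >> 24),
                 static_cast<std::uint8_t>(fixed >> 16),
                 static_cast<std::uint8_t>(fixed >> 8),
                 static_cast<std::uint8_t>(fixed)};
}

// Reassembles the four channels into the original depth (truncated to 2^-32 steps).
double unpack_depth(Rgba8 c) {
    const std::uint32_t fixed = (static_cast<std::uint32_t>(c.r) << 24) |
                                (static_cast<std::uint32_t>(c.g) << 16) |
                                (static_cast<std::uint32_t>(c.b) << 8) |
                                static_cast<std::uint32_t>(c.a);
    return static_cast<double>(fixed) / 4294967296.0;
}
```

<p>The round trip loses less than 2<sup>-32</sup> of depth, which is more than a 16-bit float target offers; the price is extra shader instructions on every write and read, and the packed values cannot be filtered by the hardware.</p>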
<p>The easiest way of putting all this together is with a many-pass compositor. Yet OGRE does not let me access the intermediate textures easily (but I&#8217;ve seen this in the OGRE compositor demo! How is that done?!) Maybe I will be forced to put them one after the other in code, or at least generate the compositors in code, which is beneficial anyway (easier adaptation to hardware and user config, etc.)</p>
<p>More to come.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.garshasp.ir/blog/archives/10/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

