A Survey of Machine Learning Techniques for Road Detection


Jake Stolee
Department of Computer Science, University of Toronto

Yuan Wang
Department of Computer Science, University of Toronto

Abstract

An important subproblem within the field of autonomous vehicle research is road detection. In this project, we built and experimented with a road detection pipeline that: (1) aggregates the pixels in each photograph into superpixels, (2) computes relevant features at the superpixel level, (3) performs classification at the superpixel level, (4) translates the classification results back to the pixel level, and (5) performs additional enhancements to the classification using structured prediction. Using this pipeline, we performed extensive experiments with various machine learning algorithms to determine which algorithms and configurations resulted in the most effective models.

1 Introduction

Road detection is an important subproblem within the field of autonomous driving research. In addition to cameras, a practical road detection system may use data from multiple sensors, including radar and LIDAR, as well as GPS and vehicle inertial measurement units (IMU). As outlined in [1], although data from multiple sensors may be available, there is still great interest in improving vision-based road detection techniques. High-quality cameras and fast image processing equipment are now readily available. Furthermore, vision-based road detection remains a technical challenge: autonomous vehicles may traverse populated areas, so visual road detection algorithms must be highly reliable and capable of adapting in real time to diverse road conditions, including changes in illumination, visibility and obstructions [1].

Recently, there have been significant improvements to image pre-processing, feature selection and the machine learning algorithms used for road detection [1]. In [8], the authors propose methods for learning a fusion of colour planes that minimizes the within-class variance of road pixels. In [9], the authors introduce a novel algorithm for adaptively computing an illumination-invariant colour feature space to address large variability in illumination (e.g. shadows). Multiple novel machine learning approaches have also been proposed; these include segmenting the scene based on surface orientation [4], convolutional neural networks [8], conditional random fields [7] and Markov random fields [11], SVM-based pixel-level classifiers combined with a morphological road model [12], and Gabor filters used to detect the direction of the road and estimate the vanishing point [6]. Some approaches also include temporal integration, which exploits the fact that the road image taken from a car-mounted camera should change smoothly over time [1].

The goal of this project was to build a road detection pipeline and to analyze the effectiveness of various machine learning algorithms in detecting roads in photos taken from vehicle-mounted cameras. This paper outlines both the pipeline and the experimental results that were observed.

2 Outline of our road detection pipeline

In this project, we built a pipeline that consists of the following components: (1) aggregating the pixels in the photograph into superpixels, (2) computing relevant features at the superpixel level, (3) performing classification at the superpixel level, (4) translating the classification results back to the pixel level, and (5) performing additional image-level enhancements. We performed experiments within each stage of the pipeline; this included observing the effect of varying the number of superpixels per image and constructing different feature sets for the superpixels. We also experimented extensively with various machine learning algorithms for classifying the pixels, and attempted to use a simple Markov random field to enhance our predictions.

2.1 Dataset and initial observations

We use the labelled training dataset of photos taken from car-mounted cameras provided by the authors of the KITTI Benchmark Suite [2]. In addition to these photos, the authors also provide photos taken from a right-mounted camera, Velodyne laser point data, and GPS and IMU data. In this project, we have only used the photos taken from the left-mounted camera. The provided dataset includes the categories um_road (urban marked road), umm_road (urban multilane road) and uu_road (urban unmarked road). We trained our classifiers and made predictions on all three categories.

We first visually inspected the training set to obtain an understanding of the road detection challenge. We noted that all of the photos were taken during the day under clear weather, and that all training examples were paved roads directly in front of the car. We expect these regularities in the data to make the classification task simpler than the more challenging situations described in [1, 5, 6, 9, 12]. However, the training examples still contained significant variability in terms of shadows introduced by trees, cars and nearby buildings, and the roads varied significantly in size, shape, direction, and the number and location of obstacles (e.g. cars).

2.2 Aggregation of pixels into superpixels

The training images are 375 × 1242 pixel colour PNG images. Therefore, treating each pixel as a training example would be computationally prohibitive. In order to make the road detection problem computationally tractable on a consumer laptop, we aggregate the pixels in each image into superpixels. There are several alternatives for aggregating the pixels, including a fixed-size window, simple linear iterative clustering (SLIC), and several graph-based and gradient-based algorithms, as outlined in [3]. In this project, we chose to aggregate the pixels using the SLICO implementation provided by the scikit-image library, a variant of SLIC that automatically adapts the compactness parameter for each superpixel to achieve smooth, regular superpixel shapes. SLICO uses k-means clustering based on the colour value and (x, y) position of the pixels. We expected SLICO to perform well because, in visual experiments, the boundaries of the superpixels appeared to align with the edges of roads. We experimented with varying the number of superpixels per image.
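As a concrete illustration of this aggregation step, the following minimal sketch segments one image with scikit-image's SLICO variant (selected via the slic_zero flag); the file name and the superpixel count are placeholders rather than settings taken from our experiments.

```python
import numpy as np
from skimage import io
from skimage.segmentation import slic

image = io.imread("um_000000.png")  # placeholder path to one KITTI training image
# slic_zero=True selects the SLICO variant, which adapts compactness per superpixel.
segments = slic(image, n_segments=10000, slic_zero=True)
print(segments.shape, len(np.unique(segments)))  # per-pixel label map, superpixel count
```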
2.3 Feature selection/engineering

As mentioned earlier, significant progress has been made recently with respect to constructing informative features for road detection. In [8], [9] and [12], the authors propose, respectively, fusions of colour planes that minimize the within-class variance of roads, illumination-invariant colour features, and Gabor filters for detecting road texture and orientation. In this project, we experimented with relatively simple features. In the future, we could augment our feature set with the more complex features mentioned above; our results with the simple features can then serve as a benchmark for evaluating them. We experimented with the following two sets of features:

(1) Basic features: average RGB value, average (x, y) position, and maximum difference in x position and y position of the pixels within a superpixel.

(2) Augmented features: the basic feature set plus average hue/saturation/value (HSV), variance in RGB, average grayscale entropy, and frequency of edge pixels contained within the superpixel.

We expected the average RGB value to be highly relevant for road detection, as colour is an important cue in the human visual system for detecting roads [1]. We also expected the average (x, y) position to be highly informative, since the roads in the training examples are directly in front of the car and have similar shapes. We thought the maximum difference in x position and y position may be helpful, since the shape and size of a superpixel may be affected by whether it contains road pixels. HSV is a colour space where hue, saturation and intensity are expressed as separate components, which allows our classification algorithms to use the hue and saturation information separately from intensity. While several papers propose more advanced techniques for handling illumination differences [8], we thought HSV could serve as a benchmark for these more advanced techniques. We also included a number of features that measure the texture, edges and complexity within each superpixel; our visual experiments with these features indicated that they may be useful for distinguishing road pixels from non-road pixels.
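A minimal sketch of how these per-superpixel features might be computed with NumPy and scikit-image is shown below; the entropy disk radius and the use of a Canny edge map are illustrative choices of ours, not settings taken from the experiments.

```python
import numpy as np
from skimage import color, feature, img_as_ubyte
from skimage.filters.rank import entropy
from skimage.morphology import disk

def superpixel_features(image, segments):
    """Build one feature vector per superpixel (basic + augmented features)."""
    hsv = color.rgb2hsv(image)
    gray = color.rgb2gray(image)
    ent = entropy(img_as_ubyte(gray), disk(5))   # local grayscale entropy
    edges = feature.canny(gray)                  # binary edge map
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]

    rows = []
    for label in np.unique(segments):
        m = segments == label
        rgb = image[m].astype(float)
        rows.append(np.concatenate([
            rgb.mean(axis=0),                          # average RGB
            [xs[m].mean(), ys[m].mean()],              # average (x, y) position
            [np.ptp(xs[m]), np.ptp(ys[m])],            # extent in x and in y
            hsv[m].mean(axis=0),                       # average HSV
            rgb.var(axis=0),                           # variance in RGB
            [ent[m].mean(), edges[m].mean()],          # average entropy, edge frequency
        ]))
    return np.vstack(rows)
```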

2.4 Superpixel-level classification algorithms

We experimented with the following machine learning algorithms for classifying road pixels at the superpixel level, using the implementations provided by the scikit-learn Python library.

Single-classifier algorithms:

1) Logistic regression: Logistic regression is a very simple and fast linear classifier. In this paper, it serves as a baseline benchmark for the more powerful algorithms.

2) K-NN: K-nearest neighbours is a simple algorithm with relatively robust assumptions, and we expected it to have good classification performance. However, K-NN is likely too slow for real-time applications without significant modifications. We use K-NN as an additional benchmark for more complex algorithms.

3) Decision trees: A decision tree is a powerful and relatively robust non-parametric classifier that uses the fixed features to distinguish road and non-road superpixels. We experimented with this model because we expected it to perform well with a non-linear decision boundary and because it serves as a popular base classifier for ensemble methods.

4) Mixture of Gaussians: This generative model uses a mixture of Gaussians to model the distribution of the superpixels of each class (road and not road) in the feature space. We experimented with this model because we expected it to perform well with a non-linear decision boundary and to provide useful insights into the distribution of superpixels in each class.

5) Neural network: A neural network is a complex and powerful model that uses a learned representation of the training examples to discriminate between road and non-road superpixels. There are many architectural decisions to be made in constructing a neural network; for example, state-of-the-art image recognition networks often include convolution layers, such as the convolutional neural net outlined in [8]. Due to limited resources for experimenting with different architectures, we chose to experiment with relatively simple fully connected networks with one or two hidden layers.
Ensemble methods:

1) Random forest and Extra Trees: A random forest builds a forest of decision trees by repeatedly selecting a random subset of features and a random sample of training data, and training a decision tree on each such subset. The Extra Trees algorithm also randomly selects a subset of features and a sample of data to train each tree, and additionally randomly selects the candidate features and threshold values considered at each decision node. We chose to experiment with random forests because we expected them to perform better than a single decision tree and to give useful insights about which features were selected.

2) Bagging (decision tree): Bagging with decision trees builds an ensemble classifier by repeatedly selecting a random sample of training data and training a decision tree on each such sample. We chose to experiment with bagging with decision trees because we expected it to perform better than a single decision tree, and we expected it to give us useful insights about the variability of the training data.

3) AdaBoost (decision tree): We used AdaBoost to build an ensemble of sequentially trained decision trees. During training, the algorithm iteratively re-weights the training examples based on which examples the previous model classified correctly and incorrectly, and trains the subsequent decision tree on the re-weighted training examples. The ensemble of decision trees then combines their votes on new examples based on their relative weighted accuracy on the training set. We chose to experiment with AdaBoost to see whether simple boosting could significantly improve the classification performance.

2.5 Projecting superpixel-level results back to pixels

After a superpixel-level classifier makes a prediction, we project the results back to individual pixels: each pixel contained within a superpixel is assigned that superpixel's overall probability of being road.

2.6 Structured prediction / image-level enhancements

The superpixel-level classifier in our pipeline only considers information local to a superpixel (e.g. position and RGB value). Intuitively, road regions respect certain structural constraints: they are generally simply connected, with limited curvature and relatively smooth boundaries. We expect that our classifier should be able to achieve better classification results if it can enforce some of these constraints. In the recent literature, geometric line/region fitting, region growing and hole filling techniques have been popular [1, 12], as have structured prediction models based on conditional random fields and Markov random fields [7, 8]. In this project, we implement a simple Markov random field (MRF) that tries to capture the intuitive assumption that neighbouring pixels are more likely to belong to the same class. We have not tried to directly capture more global structural properties of road regions, but we hope some of these properties may emerge from the local interactions between neighbouring pixels. This stage of our classification pipeline could be extended with more powerful techniques in the future.

We experimented with two MRF formulations:

1) A basic MRF formulation (from Chapter 8 of Bishop, 2006 [13]), with energy

E(x, y) = -β Σ_{(i,j)∈C} δ(x_i, x_j) - Σ_i δ(x_i, y_i),

where x_i is the proposed class label for pixel i, y_i is the label predicted by the superpixel classifier, C is the set of pairs of neighbouring pixels, and δ(·,·) equals 1 when its two arguments agree and 0 otherwise.

2) An alternative formulation, in which we assume the probability of a particular labelling of the pixels takes the form

p(x) ∝ p_C(x) p_U(x), with p_C(x) = exp{β Σ_{(i,j)∈C} δ(x_i, x_j)} and p_U(x) = Π_i p(x_i),

where p(x_i) is the probability that the superpixel classifier assigns to label x_i for pixel i (the estimated road probability when x_i is the road label). We then minimize the corresponding energy

E(x) = -β Σ_{(i,j)∈C} δ(x_i, x_j) - Σ_i log p(x_i).

We note that the performance of the MRF is highly dependent on a good choice of β and a good choice of optimization algorithm. We implemented a Markov chain Monte Carlo optimizer and used simulated annealing to optimize the prediction.
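The following is a minimal sketch of how the second formulation could be optimized with simulated annealing on a 4-connected pixel grid; the cooling schedule, neighbourhood, number of sweeps and β value are illustrative assumptions rather than the exact settings used in our experiments.

```python
import numpy as np

def energy_delta(labels, probs, i, j, beta, eps=1e-6):
    """Change in E(x) = -beta * sum_{(i,j) in C} delta(x_i, x_j) - sum_i log p(x_i)
    if the binary label at pixel (i, j) is flipped."""
    h, w = labels.shape
    old, new = labels[i, j], 1 - labels[i, j]
    p = probs[i, j]  # road probability from the superpixel classifier
    unary_old = -np.log((p if old == 1 else 1.0 - p) + eps)
    unary_new = -np.log((p if new == 1 else 1.0 - p) + eps)
    pair_old = pair_new = 0.0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-connected neighbours
        ni, nj = i + di, j + dj
        if 0 <= ni < h and 0 <= nj < w:
            pair_old -= beta * (labels[ni, nj] == old)
            pair_new -= beta * (labels[ni, nj] == new)
    return (unary_new + pair_new) - (unary_old + pair_old)

def simulated_annealing(probs, beta=0.5, sweeps=20, t0=1.0, seed=0):
    """Anneal hard labels, starting from the thresholded classifier output."""
    rng = np.random.default_rng(seed)
    labels = (probs > 0.5).astype(int)
    h, w = labels.shape
    for s in range(sweeps):
        temp = t0 * 0.9 ** s                      # simple geometric cooling schedule
        for _ in range(h * w):
            i, j = int(rng.integers(h)), int(rng.integers(w))
            d = energy_delta(labels, probs, i, j, beta)
            if d < 0 or rng.random() < np.exp(-d / temp):
                labels[i, j] = 1 - labels[i, j]   # accept the flip
    return labels
```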

3 Experimental results

In this section, we present the results of our experiments with various machine learning algorithms and with the classification pipeline. The model performance metrics reported in this paper make use of the evaluation tools made available through the KITTI Vision Benchmark Suite [2].

3.1 Visual examples

To give some visual intuition of the road detection problem, Figure 1 shows an example image on which our final classifier performed well (top) and an example image on which it struggled (bottom), with the ground truth (blue) on the left and our prediction (red) on the right. Both images are taken from the validation set. We see that our final classifier performs well on relatively straight roads, but struggles with curved roads and ambiguous regions.

Figure 1: Example ground truth classifications (left) and predictions (right)

3.2 Dataset and evaluation metrics

To evaluate performance as we experiment with different numbers of superpixels, different feature sets, different superpixel-level classifiers, and structured prediction / image-level enhancements, we use the following metrics based on the KITTI benchmark: the maximum F-score achieved over a range of thresholds (MaxF), average precision over the entire range of recall values (Avg Prec), and the recall and precision at the threshold that achieves MaxF. The authors of [2] argue that MaxF and Avg Prec are informative metrics for the optimal and average performance of road detection algorithms, so we adopted them for our project. We use these pixel-level metrics when analyzing models.

We randomly divided the 289 labelled training photos into a training set (60%), a validation set (10%) and a test set (30%). We used the validation set to select the best hyperparameters for each algorithm and to select the best classification algorithm. We note that the validation set was relatively small, which could limit the statistical significance of our validation results; k-fold cross-validation could remedy this, but due to limited computational resources we only used a simple hold-out split in this project.

3.3 Impact of the number of superpixels on classification performance

Aggregating the pixels into superpixels and making predictions at the superpixel level is expected to introduce discretization error. To understand the impact of this error, we varied the number of superpixels per image and measured the pixel-level performance of a perfect oracle superpixel classifier. The oracle simply computes the frequency of road pixels within a superpixel and assigns that value as the probability of being road to each pixel within the superpixel. The error introduced by segmenting the training data into superpixels decreases significantly as we increase the number of superpixels from 100 to 10,000; there was a slight improvement from 10,000 to 15,000, and no improvement afterwards. We repeated the experiment with a strong superpixel-level classifier (AdaBoost with decision trees, 25 estimators and a maximum tree depth of 7) and observed a similar trend in pixel-level performance. Therefore, the ideal number of superpixels is likely around 10,000 per image. The results from both experiments are shown in Figure 2.

Figure 2: Performance of classifier vs. number of superpixels per image
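Below is a minimal sketch of the oracle superpixel classifier together with a simplified pixel-level MaxF computation; the threshold grid is an illustrative choice, and our actual evaluation used the KITTI devkit rather than this stand-in.

```python
import numpy as np

def oracle_superpixel_probabilities(segments, gt_road_mask):
    """Assign to every pixel the fraction of road pixels in its superpixel."""
    probs = np.zeros(segments.shape, dtype=float)
    for label in np.unique(segments):
        m = segments == label
        probs[m] = gt_road_mask[m].mean()
    return probs

def max_f_score(probs, gt_road_mask, thresholds=np.linspace(0.05, 0.95, 19)):
    """Best pixel-level F1 over a grid of decision thresholds (a simplified MaxF)."""
    best = 0.0
    for t in thresholds:
        pred = probs >= t
        tp = np.logical_and(pred, gt_road_mask).sum()
        precision = tp / max(pred.sum(), 1)
        recall = tp / max(gt_road_mask.sum(), 1)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```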

3.4 Impact of feature selection

As discussed in Section 2.3, we experimented with two sets of features: (1) basic and (2) augmented. We used the Extremely Randomized Trees (Extra Trees) feature importance provided by the scikit-learn library to assess the relevance of the features; this measure fits many randomized trees (500 in our case) and scores each feature by its contribution to the trees' split decisions. The resulting importances are shown in Figure 3. We make the following observations: the (x, y) position of a pixel appears highly relevant for discriminating whether it is a road pixel; the H component of the HSV colour space and the B component of the RGB colour space also appear to be very relevant; the remaining components of the HSV and RGB colour spaces, the average entropy and the average frequency of edge pixels appear to be of moderate relevance; and the variance in RGB and the size of the superpixel appear to be of limited relevance. We performed limited experiments with reweighting the features based on their importance but were not able to significantly improve the classification results. Therefore, we used the full basic feature set and the full augmented feature set in our experiments.

Figure 3: Feature importance
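A minimal sketch of this importance analysis is shown below; the random feature matrix, labels and feature names are placeholders standing in for the real superpixel features and road labels.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Placeholder data standing in for the superpixel feature matrix and road labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 15))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
feature_names = [f"f{i}" for i in range(X.shape[1])]

forest = ExtraTreesClassifier(n_estimators=500, random_state=0)
forest.fit(X, y)
ranked = sorted(zip(feature_names, forest.feature_importances_), key=lambda p: -p[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```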

3.5 Comparison of superpixel-level classifiers

We tuned the hyperparameters of each superpixel-level classifier using both the basic and the augmented datasets with 5,000 superpixels per image; our experiments with each classifier are described in the following subsections. We selected the hyperparameters for each classifier based on the MaxF and Avg Prec achieved on the validation set's um_road and umm_road categories. In Figure 4, we compare the MaxF and Avg Prec of each superpixel classifier using its optimal hyperparameters. Most superpixel-level classifiers achieved similar performance, with the neural net having a small lead, followed closely by the ensemble methods. We re-ran our experiments with the 15,000-superpixel datasets and found that most algorithms had very similar performance.

Figure 4: Comparison of superpixel classifier performance

3.5.1 Logistic regression

As expected, logistic regression had a relatively poor classification rate compared to more powerful algorithms. Through experimentation, we found a regularization term of λ = 1 to work well. With the basic features only, logistic regression performed very poorly on the validation set (for example, on um_road, logistic regression had a MaxF of 68.27, while more powerful algorithms had MaxF in the range). Logistic regression performed significantly better with the augmented feature set (for example, it had a MaxF of on um_road, a 5.56 point improvement). This is expected, as logistic regression can only find linear decision boundaries in the fixed feature space; the data appears to be highly non-linearly separable in the basic feature space, but more linearly separable in the augmented feature space.

3.5.2 K-NN

As expected, K-NN had very competitive classification performance compared to more complex algorithms. It should be noted, however, that K-NN is very slow at prediction time (approximately 1 second per image at 5,000 superpixels and 50 seconds per image at 15,000 superpixels). K-NN relies on the relatively robust assumption that the target classification of the pixels varies smoothly over the metric space defined on our feature space. K-NN is sensitive to the scale of the features; we performed limited experiments with rescaling the features based on importance, but did not find significant improvements. We experimented with K-NN on the basic feature set by varying k from 1 to 99. K-NN's classification performance on the validation set improved significantly as we increased k from 1 to 15 (MaxF on um_road increased from to 89.99) and slightly as we increased k from 15 to 51 (MaxF on um_road increased from to 90.50). This suggests that the training data is relatively dense in the basic feature space, so we can increase k without introducing significant bias. It also suggests that the road and non-road pixels are comingled within the basic feature space, and that a large k reduces the risk that a prediction is significantly affected by a few outliers. We did not experiment with K-NN on the augmented feature set due to limited computational resources.

3.5.3 Decision tree

When designing a decision tree, we need to choose the criterion for selecting the feature to split on at each decision node, as well as the maximum depth of the tree. We experimented with the Gini and entropy criteria and found very similar results (with Gini having a slight edge). We also experimented with varying the maximum depth of the tree from 2 to 10 on the basic feature set, and found that the decision tree's performance on the validation set improved significantly as we increased the maximum depth to 10. For example, with a maximum depth of 2, the model achieved a MaxF score of on the um_road category, which grew to when the maximum depth was increased to 10. We did not observe negative effects of overfitting as we increased the maximum depth up to 10, likely due to the abundance of training data.
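A minimal sketch of this depth sweep in scikit-learn is shown below; the random arrays are placeholders for the superpixel features and labels of the training and validation splits, and validation accuracy stands in for the MaxF metric actually reported.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for superpixel features/labels of the two splits.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(5000, 15)), rng.integers(0, 2, 5000)
X_val, y_val = rng.normal(size=(1000, 15)), rng.integers(0, 2, 1000)

for depth in range(2, 11):
    tree = DecisionTreeClassifier(criterion="gini", max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(depth, round(tree.score(X_val, y_val), 3))  # validation accuracy per depth
```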

3.5.4 Mixture of Gaussians

The mixture of Gaussians had competitive classification performance on the validation set compared with the other complex models. We first experimented with the basic feature set and varied the number of components from 1 to 100. We observed a significant improvement in classification performance when we increased the number of components from 1 to 20 (MaxF on um_road increased from to 87.23) and a slight improvement from 20 to 100 (MaxF on um_road increased from to 88.72). The improvement in performance as we increased the number of components to a relatively high value suggests that the distribution of superpixels within each class is highly complex and multi-modal, and is better approximated as we increase the number of Gaussians in the mixture. The fact that we did not observe the detrimental effects of overfitting with a mixture of 100 components may be due to the abundance of training data. We also experimented with training the mixture of Gaussians on the augmented features and did not observe any significant improvement in performance.

3.5.5 Neural network

The neural network classifier demonstrated the best classification performance on the validation set, though it only slightly outperformed some of the ensemble methods. There are many architectural decisions in constructing a neural network, and we have only explored a limited set of options. We set the activation function of the hidden layer(s) to tanh and the activation function of the output layer to softmax. We experimented with networks with one and two fully connected hidden layers, varied the number of hidden units in each layer, and also experimented with changing the weight decay. Overall, we found that the network's classification performance on the validation set increased significantly as we decreased the weight decay. For example, with the weight decay set to 0.01, the network had a very low MaxF of on the um_road category across the various numbers of hidden units (similar to logistic regression). Once we set the weight decay to a nominal amount, a network with one hidden layer of 50 units was able to achieve a MaxF of on um_road. We also experimented with two hidden layers and found that a network with 25 units in the first layer and 15 units in the second layer had the highest MaxF on the validation set for um_road and umm_road, of and 93.38 respectively.

3.5.6 Ensemble methods

We experimented with parallel-training ensemble methods (bagging with decision trees, random forest and Extra Trees) and a sequential-training ensemble method (AdaBoost with decision trees). We varied the maximum depth of the trees from 1 to 15 and the number of estimators upward from 5.

Extra Trees and random forest

Multiple Extra Trees and random forest models were trained with the Gini split criterion, and the two algorithms exhibited very similar performance. Overall, both saw an improvement in performance as the maximum depth of the trees and the number of estimators were increased. When the maximum depth is set to a low value, the performance improves significantly with the number of estimators; when the maximum depth is high, the number of estimators has a smaller impact. Overall, 25 estimators with a maximum depth of 15 appeared to achieve the highest MaxF values on the validation set.

Bagging

We observed significant improvement in classification performance as we increased the maximum depth of the trees from 1 to 7 and minimal improvement afterwards.
We noted that varying the number of estimators had limited effect on classification performance, which suggests that our training set is sufficiently large that subsampling does not significantly affect the classifier.

AdaBoost

We observed significant improvement in classification performance as we increased the maximum depth of the trees from 1 to 5 and moderate improvement up to a maximum depth of 7. When the maximum depth of the trees is set to a low value (1 to 7), the performance increases with the number of estimators; when the maximum depth is set to a high value, the increase is less significant. We observed a very slight dip in performance for AdaBoost with 200 estimators and a maximum tree depth of 15, which may suggest overfitting; however, the dip may not be significant.
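As a sketch of how the ensemble classifiers might be instantiated in scikit-learn with the hyperparameters summarized in Table 1 below: the base-tree depth passed to AdaBoost here is an assumption based on the depth-7 discussion above, and in scikit-learn versions before 1.2 the `estimator` keyword is called `base_estimator`.

```python
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier, RandomForestClassifier)
from sklearn.tree import DecisionTreeClassifier

models = {
    "extra_trees": ExtraTreesClassifier(n_estimators=25, max_depth=15),
    "random_forest": RandomForestClassifier(n_estimators=25, max_depth=15),
    # In scikit-learn < 1.2 the keyword below is `base_estimator` instead of `estimator`.
    "bagging": BaggingClassifier(
        estimator=DecisionTreeClassifier(max_depth=15), n_estimators=25),
    "adaboost": AdaBoostClassifier(  # base-tree depth of 7 is an assumption (see above)
        estimator=DecisionTreeClassifier(max_depth=7), n_estimators=15),
}
# Each model would then be fit on the superpixel feature matrix and labels,
# e.g. models["adaboost"].fit(X_train, y_train).
```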

Based on our experiments with each superpixel classifier, varying the relevant hyperparameters and comparing classification performance on the validation set, we determined that the following hyperparameters work well for each classifier:

Table 1: Summary of hyperparameters selected through cross-validation

Algorithm                 | Hyperparameters
Logistic regression       | λ = 1 (weight regularization term)
K-NN                      | K = 51 neighbours
Mixture of Gaussians      | K = 100 components
Decision tree             | split criterion = Gini, max_depth = 10
Neural net                | 2 fully connected hidden layers (50 hidden units in the first layer, 25 in the second), tanh activation function, nominal weight decay
Extra Trees               | n_estimators = 25, max_depth = 15
Random forest             | n_estimators = 25, max_depth = 15
Bagging (decision tree)   | n_estimators = 25, max_depth = 15
AdaBoost (decision tree)  | n_estimators = 15, max_depth =

3.6 Structured prediction / image-level enhancements

As discussed in Section 2.6, we implemented and experimented with an MRF in an effort to improve the classification performance. Due to limited computational resources, we were only able to experiment with a limited number of hyperparameter settings for the MRF (β = 0.25, 0.5 and 1) and for simulated annealing. We observed visually that our implementation of the MRF appeared to reduce the number of mislabelled pixels; see Figure 5 for a comparison of the original prediction (left) and the prediction enhanced with the MRF (right). However, as shown in Table 2, the MRF did not appear to improve the classifier's MaxF score or Avg Prec. We think this may be because the MRF implementation assigns a hard label of road or not road to each pixel, so it does not benefit from the choice of an optimal threshold in the computation of MaxF.

Figure 5: Comparison of the original prediction and the prediction after MRF

Table 2: Classification performance before and after MRF enhancement

                        | UM Road (MaxF / Avg Prec) | UMM Road (MaxF / Avg Prec)
Original prediction     |                           |
MRF (our formulation)   |                           |
MRF (Bishop)            |                           |

3.7 Our final classifier

Based on our experimental results, we found the following settings to provide the best MaxF and Avg Prec on the validation set: (1) aggregate the pixels into 10,000 superpixels per image, (2) compute the augmented feature set described previously, and (3) use a two-hidden-layer neural network with 50 hidden units in the first layer and 25 hidden units in the second layer. We present the results of our final algorithm on the test set in Table 4 below.

Table 4: Evaluation of the final classifier on the test set

Dataset   | MaxF | Avg Prec | Recall_wp | Prec_wp
UM Road   |      |          |           |
UMM Road  |      |          |           |
UU Road   |      |          |           |

4 Conclusion

In this project, we built and experimented with a classification pipeline for detecting road pixels in photos taken from a car-mounted camera. We experimented with each stage of the pipeline, investigating the impact of varying the number of superpixels per image and of constructing different feature sets. We also experimented extensively with different machine learning algorithms for classifying the superpixels, and performed some experiments with using structured prediction (an MRF) to enhance our classification results. Based on our experiments, we designed a well-performing classifier, which we report in Section 3.7.

References

[1] Hillel, Aharon Bar, et al. "Recent progress in road and lane detection: a survey." Machine Vision and Applications 25.3 (2014).
[2] Fritsch, Joerg, Tobias Kuhnl, and Andreas Geiger. "A new performance measure and evaluation benchmark for road detection algorithms." International IEEE Conference on Intelligent Transportation Systems (ITSC), 2013.
[3] Achanta, Radhakrishna, et al. "SLIC superpixels compared to state-of-the-art superpixel methods." IEEE Transactions on Pattern Analysis and Machine Intelligence (2012).
[4] Hoiem, Derek, Alexei A. Efros, and Martial Hebert. "Recovering surface layout from an image." International Journal of Computer Vision 75.1 (2007).
[5] Alvarez, José M., Mathieu Salzmann, and Nick Barnes. "Learning appearance models for road detection." IEEE Intelligent Vehicles Symposium (IV), 2013.
[6] Kong, Hui, Jean-Yves Audibert, and Jean Ponce. "General road detection from a single image." IEEE Transactions on Image Processing 19.8 (2010).
[7] Xiao, Liang, et al. "CRF based road detection with multi-sensor fusion." IEEE Intelligent Vehicles Symposium (IV), 2015.
[8] Alvarez, Jose M., et al. "Road scene segmentation from a single image." Computer Vision - ECCV 2012 (2012).
[9] Álvarez, José M., and Antonio M. López. "Road detection based on illuminant invariance." IEEE Transactions on Intelligent Transportation Systems 12.1 (2011).
[10] Kato, Zoltan, and Ting-Chuen Pong. "A Markov random field image segmentation model for color textured images." Image and Vision Computing (2006).
[11] Guo, C., Mita, S., and McAllester, D. "Robust road detection and tracking in challenging scenarios based on Markov random fields with unsupervised learning." IEEE Transactions on Intelligent Transportation Systems 13.3 (2012).
[12] Zhou, S., Gong, J., Xiong, G., Chen, H., and Iagnemma, K. "Road detection using support vector machine based on online learning and evaluation." IEEE Intelligent Vehicles Symposium (IV), 2010.
[13] Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006.


Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation Learning 4 Supervised Learning 4 Unsupervised Learning 4

More information

Discriminative classifiers for image recognition

Discriminative classifiers for image recognition Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study

More information

Robust PDF Table Locator

Robust PDF Table Locator Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

More information

CSC411 Fall 2014 Machine Learning & Data Mining. Ensemble Methods. Slides by Rich Zemel

CSC411 Fall 2014 Machine Learning & Data Mining. Ensemble Methods. Slides by Rich Zemel CSC411 Fall 2014 Machine Learning & Data Mining Ensemble Methods Slides by Rich Zemel Ensemble methods Typical application: classi.ication Ensemble of classi.iers is a set of classi.iers whose individual

More information

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Mobile Human Detection Systems based on Sliding Windows Approach-A Review Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg

More information

Markov Networks in Computer Vision. Sargur Srihari

Markov Networks in Computer Vision. Sargur Srihari Markov Networks in Computer Vision Sargur srihari@cedar.buffalo.edu 1 Markov Networks for Computer Vision Important application area for MNs 1. Image segmentation 2. Removal of blur/noise 3. Stereo reconstruction

More information

Road Detection Using Support Vector Machine based on Online Learning and Evaluation

Road Detection Using Support Vector Machine based on Online Learning and Evaluation Road Detection Using Support Vector achine based on Online Learning and Evaluation Shengyan Zhou, Jianwei Gong, Guangming Xiong, Huiyan Chen and Karl Iagnemma Abstract Road detection is an important problem

More information

Data Mining Lecture 8: Decision Trees

Data Mining Lecture 8: Decision Trees Data Mining Lecture 8: Decision Trees Jo Houghton ECS Southampton March 8, 2019 1 / 30 Decision Trees - Introduction A decision tree is like a flow chart. E. g. I need to buy a new car Can I afford it?

More information

Bayes Risk. Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Discriminative vs Generative Models. Loss functions in classifiers

Bayes Risk. Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Discriminative vs Generative Models. Loss functions in classifiers Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Examine each window of an image Classify object class within each window based on a training set images Example: A Classification Problem Categorize

More information

Correcting User Guided Image Segmentation

Correcting User Guided Image Segmentation Correcting User Guided Image Segmentation Garrett Bernstein (gsb29) Karen Ho (ksh33) Advanced Machine Learning: CS 6780 Abstract We tackle the problem of segmenting an image into planes given user input.

More information

Continuous Road Surface Distress Detection

Continuous Road Surface Distress Detection Continuous Road Surface Distress Detection PI: Christoph Mertz, The Robotics Institute, Carnegie Mellon University Research team: Martial Hebert, Sobhagya Jose, Karan Sharma, Srivatsan Varadharajan, Lars

More information

Review of feature selection techniques in bioinformatics by Yvan Saeys, Iñaki Inza and Pedro Larrañaga.

Review of feature selection techniques in bioinformatics by Yvan Saeys, Iñaki Inza and Pedro Larrañaga. Americo Pereira, Jan Otto Review of feature selection techniques in bioinformatics by Yvan Saeys, Iñaki Inza and Pedro Larrañaga. ABSTRACT In this paper we want to explain what feature selection is and

More information

All about the road: detecting road type, road damage, and road conditions

All about the road: detecting road type, road damage, and road conditions Paper ID # All about the road: detecting road type, road damage, and road conditions Nikitha S. Poddatur 1, Christoph Mertz 2* 1. Carnegie Mellon University, USA; current address: Mumbai India 2. Carnegie

More information

Joint Object Detection and Viewpoint Estimation using CNN features

Joint Object Detection and Viewpoint Estimation using CNN features Joint Object Detection and Viewpoint Estimation using CNN features Carlos Guindel, David Martín and José M. Armingol cguindel@ing.uc3m.es Intelligent Systems Laboratory Universidad Carlos III de Madrid

More information

The exam is closed book, closed notes except your one-page cheat sheet.

The exam is closed book, closed notes except your one-page cheat sheet. CS 189 Fall 2015 Introduction to Machine Learning Final Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right

More information

Classifiers for Recognition Reading: Chapter 22 (skip 22.3)

Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Examine each window of an image Classify object class within each window based on a training set images Slide credits for this chapter: Frank

More information

W4. Perception & Situation Awareness & Decision making

W4. Perception & Situation Awareness & Decision making W4. Perception & Situation Awareness & Decision making Robot Perception for Dynamic environments: Outline & DP-Grids concept Dynamic Probabilistic Grids Bayesian Occupancy Filter concept Dynamic Probabilistic

More information

Multi-view Stereo. Ivo Boyadzhiev CS7670: September 13, 2011

Multi-view Stereo. Ivo Boyadzhiev CS7670: September 13, 2011 Multi-view Stereo Ivo Boyadzhiev CS7670: September 13, 2011 What is stereo vision? Generic problem formulation: given several images of the same object or scene, compute a representation of its 3D shape

More information

Generic Face Alignment Using an Improved Active Shape Model

Generic Face Alignment Using an Improved Active Shape Model Generic Face Alignment Using an Improved Active Shape Model Liting Wang, Xiaoqing Ding, Chi Fang Electronic Engineering Department, Tsinghua University, Beijing, China {wanglt, dxq, fangchi} @ocrserv.ee.tsinghua.edu.cn

More information

A Novel Approach to Image Segmentation for Traffic Sign Recognition Jon Jay Hack and Sidd Jagadish

A Novel Approach to Image Segmentation for Traffic Sign Recognition Jon Jay Hack and Sidd Jagadish A Novel Approach to Image Segmentation for Traffic Sign Recognition Jon Jay Hack and Sidd Jagadish Introduction/Motivation: As autonomous vehicles, such as Google s self-driving car, have recently become

More information

Fast Edge Detection Using Structured Forests

Fast Edge Detection Using Structured Forests Fast Edge Detection Using Structured Forests Piotr Dollár, C. Lawrence Zitnick [1] Zhihao Li (zhihaol@andrew.cmu.edu) Computer Science Department Carnegie Mellon University Table of contents 1. Introduction

More information

Character Recognition from Google Street View Images

Character Recognition from Google Street View Images Character Recognition from Google Street View Images Indian Institute of Technology Course Project Report CS365A By Ritesh Kumar (11602) and Srikant Singh (12729) Under the guidance of Professor Amitabha

More information

Deep Face Recognition. Nathan Sun

Deep Face Recognition. Nathan Sun Deep Face Recognition Nathan Sun Why Facial Recognition? Picture ID or video tracking Higher Security for Facial Recognition Software Immensely useful to police in tracking suspects Your face will be an

More information