HOG+SVM实现行人检测(一) 图像切割

1,切割正、负样本图像,并把图片名存为txt

#include <iostream>
#include <iostream>
#include <fstream>
#include <stdlib.h> //srand()和rand()函数
#include <time.h> //time()函数
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/ml/ml.hpp>

#define INRIANegativeImageList "/Users/macbookpro/CLionProjects/pedestrian_detection/img_dir/sample_neg.txt" //原始负样本图片文件列表
#define cropNegNum 1214
//负样本图片个数
using namespace std;
using namespace cv;

int CropImageCount = 0; //裁剪出来的负样本图片个数

int main()
{
   Mat src;
   string ImgName;

   char saveName[256];//裁剪出来的负样本图片文件名
   ifstream fin(INRIANegativeImageList);//打开原始负样本图片文件列表
   //ifstream fin("subset.txt");
   int num = 0;

   //一行一行读取文件列表
   while(getline(fin,ImgName))
   {
      cout<<"处理:"<<ImgName<<endl;
      ImgName = "/Users/macbookpro/CLionProjects/pedestrian_detection/normalized_images/train/neg/" + ImgName;

      src = imread(ImgName,1);//读取彩色图片

      //src =cvLoadImage(imagename,1);//c的接口,
      // 例子:IplImage *pic = cvLoadImage("timu1.jpg",1); cvShowImage("load",pic);
      //cout<<"宽:"<<src.cols<<",高:"<<src.rows<<endl;

      //图片大小应该能能至少包含一个64*128的窗口
      if(src.cols >= 64 && src.rows >= 128)
      {
         srand(time(NULL));//设置随机数种子

         //从每张图片中随机裁剪10个64*128大小的不包含人的负样本
         for(int i=0; i<10; i++)
         {
            int x = ( rand() % (src.cols-64) ); //左上角x坐标
            int y = ( rand() % (src.rows-128) ); //左上角y坐标
            //cout<<x<<","<<y<<endl;
            Mat imgROI = src(Rect(x,y,64,128));
            //Rect rect(400,400,300,300);
            // Mat image_cut = Mat(img, rect);
            sprintf(saveName,"/Users/macbookpro/CLionProjects/pedestrian_detection/normalized_images/train/new_neg/noperson%06d.jpg",++CropImageCount);//生成裁剪出的负样本图片的文件名
            imwrite(saveName, imgROI);//保存文件

             //保存裁剪得到的图片名称到txt文件,换行分隔
                if(i<(cropNegNum-1)){
                    fout <<"neg" << i << "Cropped"<< num++ << ".png"<< endl;
                }
                else if(i==(cropNegNum-1) && j<4){
                    fout <<"neg" << i << "Cropped"<< num++ << ".png"<< endl;
                }
                else{
                    fout <<"neg" << i << "Cropped"<< num++ << ".png";
                }
         }
      }
   }
  cout<<"总共裁剪出"<<CropImageCount<<"张图片"<<endl;

}

 

已标记关键词 清除标记
相关推荐
Human parsing has been extensively studied recently (Yamaguchi et al. 2012; Xia et al. 2017) due to its wide applications in many important scenarios. Mainstream fashion parsing models (i.e., parsers) focus on parsing the high-resolution and clean images. However, directly applying the parsers trained on benchmarks of high-quality samples to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives non-satisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking the benchmark dataset with extensive pixel-wise labeling as the source domain, how to obtain a satisfactory parser on a new target domain without requiring any additional manual labeling? To this end, we propose a novel and efficient crossdomain human parsing model to bridge the cross-domain differences in terms of visual appearance and environment conditions and fully exploit commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation to effectively reduces the discrepancy between feature distributions of two domains. Besides, our proposed model also introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted where LIP dataset is the source domain and 4 different datasets including surveillance videos, movies and runway shows without any annotations, are evaluated as target domains. The results consistently confirm data efficiency and performance advantages of the proposed method for the challenging cross-domain human parsing problem. Abstract—This paper presents a robust Joint Discriminative appearance model based Tracking method using online random forests and mid-level feature (superpixels). To achieve superpixel- wise discriminative ability, we propose a joint appearance model that consists of two random forest based models, i.e., the Background-Target discriminative Model (BTM) and Distractor- Target discriminative Model (DTM). More specifically, the BTM effectively learns discriminative information between the target object and background. In contrast, the DTM is used to suppress distracting superpixels which significantly improves the tracker’s robustness and alleviates the drifting problem. A novel online random forest regression algorithm is proposed to build the two models. The BTM and DTM are linearly combined into a joint model to compute a confidence map. Tracking results are estimated using the confidence map, where the position and scale of the target are estimated orderly. Furthermore, we design a model updating strategy to adapt the appearance changes over time by discarding degraded trees of the BTM and DTM and initializing new trees as replacements. We test the proposed tracking method on two large tracking benchmarks, the CVPR2013 tracking benchmark and VOT2014 tracking challenge. Experimental results show that the tracker runs at real-time speed and achieves favorable tracking performance compared with the state-of-the-art methods. The results also sug- gest that the DTM improves tracking performance significantly and plays an important role in robust tracking.
©️2020 CSDN 皮肤主题: 岁月 设计师:pinMode 返回首页